
Genome wide identification, phylogeny, and synteny analysis of sox gene family in common carp (Cyprinus carpio).

Imran Zafar, Rida Iftikhar, Syed Umair Ahmad, Mohd Ashraf Rather.

Abstract

Common carp (Cyprinus carpio) is a commercially important fish species valued for its nutritional components, and it plays a vital role in healthy human nutrition. SOX genes (SRY-related genes characterized by a high-mobility group, HMG, box) encode important gene regulatory proteins, a family of transcription factors found in a broad range of animal taxa and widely known for its contribution to multiple developmental processes, including sex determination across phyla. In the current study, we carried out a genome-wide analysis of the SOX gene family in common carp, using the available genomic sequences of zebrafish retrieved from gene repository databases as queries. We focused on the global identification of the SOX gene family in common carp, compared it across a wide range of vertebrates and teleosts using bioinformatics tools and techniques, and explored the evolutionary relationships. A total of 27 SOX (HMG-box) domain genes were identified in the C. carpio genome. The full-length sequences of the SOX genes ranged from 3496 bp (SOX6) to 924 bp (SOX17b) and encoded putative proteins of 307 to 509 amino acids; an exon number was available for all genes except SOX9 and SOX13. All the SOX proteins contained at least one conserved DNA-binding HMG-box domain, and two (SOX7 and SOX18) were also found to have a C-terminal domain. Gene Ontology analysis revealed that the maximum involvement of SOX proteins is in metabolic process (49.796 %), with further involvement in biological regulation (45.188 %), biosynthetic process (19.992 %), regulation of cellular process (39.68 %), organic substance metabolic process (45.508 %), multicellular organismal process (23.23 %), developmental process (21.74 %), system development (16.59 %), gene expression (16.05 %) and RNA metabolic process (14.337 %). Chromosomal location and syntenic analysis showed that the SOX genes are located on different chromosomes and apparently do not follow a single pattern. The maximum linkage (2) is on the unplaced scaffold region. Finally, our results provide an important genomic resource for future studies of the biochemical, physiological, and phylogenetic aspects of SOX genes in teleosts.
© 2021 Published by Elsevier B.V.

Keywords:  Common carp; Conserved domains; Fishes; Genome wide analysis; Phylogenetic tree; Protein motif; SOX genes; Sex determination; Syntenic analysis

Year:  2021        PMID: 33936955      PMCID: PMC8076717          DOI: 10.1016/j.btre.2021.e00607

Source DB:  PubMed          Journal:  Biotechnol Rep (Amst)        ISSN: 2215-017X


Introduction

Sex determination (SD) and sex differentiation are fundamental developmental processes that are commonly controlled by transcription factors across a wide diversity of vertebrates and invertebrates [1]. Different genetic and environmental factors are involved, and a range of tools and technologies is available for investigating sex determination mechanisms and differentiation patterns. Among vertebrates, gonadal determination and differentiation are accomplished through the interaction of complex networks of transcription factors [2]. The high diversity of SD mechanisms observed in fish is connected to the high turnover rate of their sex chromosomes, which is why these chromosomes are considered young and are described as homomorphic. Sex chromosome turnover is linked to gonadal differentiation and gives rise to new master genes for sex determination [3]. Fishes make up approximately half of all vertebrates and show great variation in the sex determination process [3], [4]. In vertebrates, including fish, several genes have been reported to be related to sex determination mechanisms, including AMHR2, GDF6 and the DMRT genes, which have conserved DNA-binding DM domains [5]; the SOX gene family, with its HMG (high-mobility group) domain, also plays an important role in sex determination and differentiation. The SOX gene family is a group of evolutionarily conserved transcription factors involved in various developmental processes: not only sex determination and sex differentiation but also the formation of multiple organs, endoderm development, eyes [6], [7], angiogenesis, gonad [8], [9], chondrogenesis, neurogenesis [10], [11], [12], cardiogenesis [13], [14], cartilage [15], [16] and pancreas [17], [18]. The identification of SOX genes began with the discovery of Sry (the testis-determining factor) in mammals [19], [20], which revealed a high-mobility group (HMG) domain that binds specific DNA sequences [21].
SOX genes share 80 % sequence identity within the HMG domain, which is commonly conserved from SOX1 to SOX32. With the availability of whole-genome sequencing (WGS) and genome-wide characterization (GWC), more than 40 members of the SOX family and a diverse variety of proteins have been identified and analyzed in mammals, birds, amphibians, reptiles, insects and fishes [22], [23]. For instance, earlier researchers reported variable numbers of SOX genes in different organisms, such as 8 SOX genes in Drosophila, 19 in Japanese medaka, over 20 in mouse and human [24], and 27 in Nile tilapia [25]. The SOX gene family is divided into different groups and subgroups among higher vertebrates and teleosts based on the similarity of DNA or protein sequences [23], [26], [28]. Earlier studies on the growth and development of teleost fishes have explored the potential roles of SOX genes. For example, the SOX1, SOX2 and SOX3 proteins bind specifically to DNA via the HMG domain and act as transcription factors, or take part in protein-protein interactions, for example with POU proteins [29]. SOX1 and SOX3 have a C-terminal region that enables them to act as transcription factors by forming protein-protein complexes, and sox1 is expressed during development in the formation of the neural plate [30]. The expression of SOX2 and SOX3 helps maintain progenitor cell identity by inhibiting neurogenesis [29]. In mammals and fish, sox6 acts as a transcription factor that controls the identity of skeletal muscle fibers [31]. Sox8, sox9 and sox10 are involved in many developmental processes, such as testis development, maintenance of male fertility, the early developing state of human gonads, expression in somatic cells, and sex determination with the help of the DMRT1 gene [32], [33]. Sox9 is an essential transcriptional regulator for the development of adult cartilage and activates the transcription of genes for structural components [34].
Likewise, many members of the SOX gene family have been identified and share some common biochemical properties, with functions that overlap in different biological contexts (Sarkar et al., 2013). Among all known SOX genes, SOX9 is recognized as the most important transcription factor necessary for sex differentiation in vertebrates and has two variant forms, SOX9a and SOX9b, in teleosts [2]. Comparative genomics involves the evaluation and analysis of gene sequences and regulatory regions between different species, and provides a novel approach to identifying the common carp SOX genes. Computational approaches to whole-genome analysis are becoming more common, not only in the biological sciences but also in aquatic biotechnology research. The aim of the present study was to identify the SOX gene family in common carp (Cyprinus carpio), to relate it to gene divergence among species living under diverse environmental conditions, and to provide genomic resources for future work on SOX genes. Here we used all accessible gene resources from the model fish and report the genome-wide identification of the SOX gene family in common carp (Cyprinus carpio). The structure of the sequences and the functional domains of the SOX genes were predicted, followed by phylogenetic and structural analysis. Our systematic study of the SOX genes also provides basic genomic resources that may help to better understand the evolutionary and physiological aspects of the whole-genome duplication (WGD) in common carp. However, genome-wide analyses of the SOX gene family remain scarce in fish.

Material and methods

The availability of all SOX genes in the zebrafish (Danio rerio) genome made it possible to select candidate genes, which were downloaded from Ensembl (https://asia.ensembl.org/Danio_rerio/Info/Index) and are listed in Table 1. The zebrafish genes were used as query sequences to search the common carp (Cyprinus carpio) genome available in the NCBI genome database (https://www.ncbi.nlm.nih.gov/genome) to identify candidate genes. We applied different strategies to the available genomic resources of common carp, including whole-genome sequences, and retrieved cDNAs by BLAST searches with an E-value cutoff of 1e-10, following earlier reports [20], [35]. Reciprocal BLAST searches were then conducted using the candidate common carp SOX genes as queries to confirm the accuracy of the candidate genes. Furthermore, the coding sequences were confirmed by BLASTN searches against the NCBI non-redundant protein sequence database (nr). Subsequently, the conserved HMG-box domain (PDOC00305) was searched against the local database with the BLASTp program [36], and irrelevant sequences were removed manually. SOX protein sequences from other organisms were retrieved from the NCBI (https://www.ncbi.nlm.nih.gov/) genome databases using BLAST with a threshold of 1e-10, and all genes were mapped to their individual chromosomal locations. Pfam (https://pfam.xfam.org/), SMART (http://smart.embl-heidelberg.de/), and other databases were used to check that the candidate sequences contained HMG-box domains [37].
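The E-value filtering and reciprocal search described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the BLAST hits have already been parsed into (query, subject, E-value, bit score) tuples, and all sequence names and scores in the toy data are hypothetical.

```python
E_VALUE_CUTOFF = 1e-10  # threshold quoted in the text


def best_hits(rows, cutoff=E_VALUE_CUTOFF):
    """Map each query to its highest-bit-score subject among hits passing the cutoff.

    rows: iterable of (query, subject, evalue, bitscore) tuples, for example
    parsed from tabular BLAST output (the parsing step is assumed, not shown).
    """
    best = {}
    for query, subject, evalue, bitscore in rows:
        if evalue > cutoff:
            continue  # discard weak hits
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_rows, reverse_rows, cutoff=E_VALUE_CUTOFF):
    """Keep (query, subject) pairs that are each other's best hit in both directions."""
    forward = best_hits(forward_rows, cutoff)
    reverse = best_hits(reverse_rows, cutoff)
    return sorted((q, s) for q, s in forward.items() if reverse.get(s) == q)


# Hypothetical toy hits: zebrafish queries against carp, then carp against zebrafish.
forward = [
    ("zf_sox2", "carp_A", 1e-80, 300.0),
    ("zf_sox2", "carp_B", 1e-40, 150.0),
    ("zf_sox3", "carp_C", 1e-5, 40.0),    # fails the E-value cutoff
    ("zf_sox9a", "carp_D", 1e-60, 220.0),
]
reverse = [
    ("carp_A", "zf_sox2", 1e-78, 295.0),
    ("carp_D", "zf_sox9b", 1e-70, 250.0),  # carp_D prefers a different gene
    ("carp_D", "zf_sox9a", 1e-58, 215.0),
]
pairs = reciprocal_best_hits(forward, reverse)
print(pairs)
```

Choosing the best hit by bit score is one common convention; ranking by E-value would be an equally reasonable assumption.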
Table 1

Summary of SOX genes in Danio rerio. All available SOX genes of zebrafish (Danio rerio) downloaded from Ensembl (http://asia.ensembl.org/index.html) were used as query sequences to search against the common carp (Cyprinus carpio) genome. The table lists the gene ID, name of each Sox gene, genomic length (bp), CDS length (bp), CDS length (amino acids), CDS status, number of exons, chromosome number, genome location and the assembly available for open access. In this table, chromosome 7 contains the highest number of SOX genes (SOX6, SOX17, SOX19b and SOX32), chromosome 3 contains three SOX genes (SOX8b, SOX9b and SOX10), chromosome 11 has two SOX genes (SOX12 and SOX13), and the remaining SOX genes are distributed over other chromosomes.

Gene ID | Sox gene | Genomic length (bp) | CDS (bp) | CDS (aa) | CDS status | No. of exons | Chromosome number | Genome location | Assembly
ZDB-GENE-040718-186 | Sox1a | 1945 | 1010 | 336 | Complete | 1 | 9 | NC_007120.7 (21722734..21724661) | GRCz11
ZDB-GENE-060322-5 | Sox1b | 1773 | 1022 | 340 | Complete | 1 | 1 | NC_007112.7 (46194333..46196106) | GRCz11
ZDB-GENE-030909-1 | Sox2 | 2120 | 947 | 315 | Complete | 1 | 22 | NC_007133.7 (37347893..37349978) | GRCz11
ZDB-GENE-980526-333 | Sox3 | 1781 | 902 | 300 | Complete | 1 | 14 | NC_007125.7 (32742701..32744464) | GRCz11
ZDB-GENE-030131-8290 | Sox4a | 3350 | 1091 | 363 | Complete | 1 | 19 | NC_007130.7 (28786149..28789409) | GRCz11
ZDB-GENE-040426-1274 | Sox4b | 3015 | 1028 | 342 | Complete | 2 | 16 | NC_007127.7 (68069..71110) | GRCz11
ZDB-GENE-000607-13 | Sox5 | 2393 | 2279 | 759 | Complete | 30 | 4 | NC_007115.7 (16981414..17244201) | GRCz11
ZDB-GENE-081120-6 | Sox6 | 1242 | 410 | 136 | Incomplete | 14 | 7 | NC_007118.7 (27320462..27449391) | GRCz11
ZDB-GENE-040109-4 | Sox7 | 2246 | 1172 | 390 | Complete | 2 | 20 | NC_007131.7 (19075127..19078424) | GRCz11
ZDB-GENE-130530-719 | Sox8a | 2000 | 1205 | 401 | Complete | 4 | 12 | NC_007123.7 (1072485..1085503) | GRCz11
ZDB-GENE-031114-1 | Sox8b | 1934 | 1076 | 358 | Complete | 3 | 3 | NC_007114.7 (62397711..62403488) | GRCz11
ZDB-GENE-001103-1 | Sox9a | 1790 | 1388 | 462 | Complete | 3 | 12 | NC_007123.7 (1947593..1951233) | GRCz11
ZDB-GENE-001103-2 | Sox9b | 1916 | 1232 | 407 | Complete | 3 | 3 | NC_007114.7 (62522542..62527667) | GRCz11
ZDB-GENE-011207-1 | Sox10 | 3231 | 1457 | 485 | Complete | 4 | 3 | NC_007114.7 (1492174..1501252) | GRCz11
ZDB-GENE-980526-395 | Sox11a | 3280 | 1064 | 354 | Complete | 1 | 17 | NC_007128.7 (35878547..35881807) | GRCz11
ZDB-GENE-980526-466 | Sox11b | 2666 | 1106 | 368 | Complete | 1 | 20 | NC_007131.7 (30032636..30035292) | GRCz11
ZDB-GENE-040724-33 | Sox12 | 1401 | 1067 | 355 | Complete | 2 | 11 | NC_007122.7 (24190164..24191928) | GRCz11
ZDB-GENE-100519-1 | Sox13 | 5380 | 1790 | 596 | Complete | 14 | 11 | NC_007122.7 (37768409..37831781) | GRCz11
ZDB-GENE-051113-268 | Sox14 | 1935 | 716 | 238 | Complete | 1 | 6 | NC_007117.7 (26557979..26559883) | GRCz11
ZDB-GENE-991213-1 | Sox17 | 1705 | 1241 | 413 | Complete | 2 | 7 | NC_007118.7 (58773148..58776152) | GRCz11
ZDB-GENE-080725-1 | Sox18 | 2545 | 1295 | 331 | Complete | 2 | 23 | NC_007134.7 (8797002..8800754) | GRCz11
ZDB-GENE-980526-102 | Sox19a | 2181 | 893 | 297 | Complete | 2 | 5 | NC_007116.7 (24199180..24201444) | GRCz11
ZDB-GENE-010111-1 | Sox19b | 2556 | 881 | 293 | Complete | 2 | 7 | NC_007118.7 (26494585..26497947) | GRCz11
ZDB-GENE-990715-6 | Sox21a | 1098 | 719 | 239 | Complete | 1 | 6 | NC_007117.7 (7414273..7415343) | GRCz11
ZDB-GENE-040429-1 | Sox21b | 1374 | 737 | 245 | Complete | 1 | 9 | NC_007120.7 (53276356..53277702) | GRCz11
ZDB-GENE-011026-1 | Sox32 | 1222 | 923 | 307 | Complete | 2 | 7 | NC_007118.7 (58824526..58826178) | GRCz11

Gene characterization and structure

To characterize the structures of the common carp genes and compare them with their orthologs in the zebrafish (Danio rerio) and human (Homo sapiens) genomes, we first carried out exon-intron structure analysis with the online tool Gene Structure Display Server 2.0 (http://gsds.cbi.pku.edu.cn/). MEME (Multiple Expectation-Maximization for Motif Elicitation) 5.05 (http://meme-suite.org/tools/meme) was used to identify conserved motif regions in the protein sequences. MEME is an online suite of motif discovery and searching tools [38]. We used the default MEME settings for the motif analysis, with a maximum of 10 motifs per sequence and an optimum motif width of up to 50.

Proteomics analysis of SOX gene family

The size of the individual protein sequences, molecular weight (MW), extinction coefficient and theoretical isoelectric point (pI) of each SOX protein were calculated with the online tool ProtParam (http://www.protparam.net/index.html) on the web-based ExPASy Proteomics Server, and sequence analysis was performed on the PSIPRED 4.0 server (http://bioinf.cs.ucl.ac.uk/psipred/). The motifs discovered by MEME were searched against the ExPASy PROSITE database with the ScanProsite tool (https://prosite.expasy.org/scanprosite/), which is linked to a knowledge-based database, to predict conserved domains (CD) based on sequence homology (SH); the predictions were further confirmed by BLASTN (https://blast.ncbi.nlm.nih.gov/Blast).
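As an illustration of one of the quantities listed above, the sketch below computes a molar extinction coefficient at 280 nm from residue counts, using the commonly cited per-residue values for tyrosine, tryptophan and cystine. It is a simplified stand-in for what a tool such as ProtParam reports, not a reproduction of the authors' run; the peptide sequence is hypothetical, and treating every pair of cysteines as one cystine is an assumption.

```python
# Per-residue molar extinction coefficients at 280 nm (M^-1 cm^-1).
EXT_TYR = 1490
EXT_TRP = 5500
EXT_CYSTINE = 125


def extinction_coefficient(sequence, cystines=True):
    """Estimate the 280 nm extinction coefficient of a protein sequence.

    With cystines=True every pair of Cys residues is assumed to form a
    disulfide bond (an assumption); with cystines=False all Cys are reduced.
    """
    seq = sequence.upper()
    n_tyr = seq.count("Y")
    n_trp = seq.count("W")
    n_cystine = seq.count("C") // 2 if cystines else 0
    return n_tyr * EXT_TYR + n_trp * EXT_TRP + n_cystine * EXT_CYSTINE


toy_peptide = "MYWKCCAYW"  # hypothetical sequence: 2 Tyr, 2 Trp, 2 Cys
print(extinction_coefficient(toy_peptide))         # all Cys paired
print(extinction_coefficient(toy_peptide, False))  # all Cys reduced
```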

Protein-protein interaction and Gene Ontology (GO) of SOX genes

Direct (physical) and indirect (functional) protein-protein network interactions (PPNIs) for all SOX genes identified in the common carp genome were analyzed with online databases such as STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) (https://string-db.org/), Pfam (https://pfam.xfam.org/) and SMART (http://smart.embl-heidelberg.de/), and were represented interactively with GeneMANIA, which is available online and as desktop software.

Chromosome mapping and synteny analysis

To build a physical map of the individual chromosomes and find the actual gene locations, data for all SOX genes (common carp, zebrafish and human) were downloaded from the NCBI (https://www.ncbi.nlm.nih.gov/) and Ensembl (http://asia.ensembl.org/index.html) databases. MapChart 2.32, a program for the graphical presentation of linkage maps and quantitative trait loci (QTLs), was used to map the genes onto the individual chromosomes of common carp. For the syntenic analysis, SOX paralog pairs were identified by searching for gene duplications among all the species in the NCBI and Ensembl databases and were mapped individually with MapChart.

Sequence alignment and phylogenetic analysis

The DNA sequences of the HMG box of all common carp SOX genes were extracted based on the SMART analysis [37] and aligned with Clustal Omega (https://www.ebi.ac.uk) and the online tool T-Coffee (http://tcoffee.crg.cat) using default parameters. The amino acid sequences of proteins from 10 species (human, mouse, dog, chimpanzee, Nile tilapia, goldfish, medaka, rainbow trout, guppy and Atlantic salmon) were aligned with the ClustalX program [39]. For the phylogenetic analysis, a neighbor-joining tree was constructed with the SOX proteins from common carp as reference, using a bootstrap test of 1000 replicates in MEGA 7.0 [40]. Both trees were displayed with the online tool Interactive Tree Of Life (iTOL) (https://itol.embl.de), which shows phylogenetic trees with different color-coded ranges [41].
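The two ingredients of the tree-building step, a distance-based neighbor-joining criterion and bootstrap resampling of alignment columns, can be sketched in a few lines. This is only an illustration of the idea: MEGA's distance model and implementation are not described in the text, so the use of a simple p-distance here is an assumption, and the four aligned sequences are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, skipping gap columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def nj_first_join(alignment):
    """Return the pair of taxa that neighbor-joining would join first.

    Uses the standard criterion Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j),
    where r(i) is the sum of distances from taxon i to all other taxa.
    """
    names = sorted(alignment)
    n = len(names)
    d = {(i, j): p_distance(alignment[i], alignment[j]) for i in names for j in names}
    row_sum = {i: sum(d[i, j] for j in names) for i in names}
    best_pair, best_q = None, None
    for a in range(n):
        for b in range(a + 1, n):
            i, j = names[a], names[b]
            q = (n - 2) * d[i, j] - row_sum[i] - row_sum[j]
            if best_q is None or q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment of four taxa.
toy = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAA",
    "C": "TCGAACGGTC",
    "D": "TCGAACGGTA",
}
first_join = nj_first_join(toy)
replicate = bootstrap_replicate(toy, random.Random(0))
print(first_join, replicate)
```

In a full analysis the join step is repeated until the tree is resolved, and the share of bootstrap replicates recovering each clade gives its support value.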

Results

Identification of SOX genes in common carp

We identified a total of 27 SOX genes in the common carp genome using all available genomic resources and the zebrafish queries: SOX1a, SOX1b, SOX2, SOX3, SOX4a, SOX4b, SOX5, SOX6, SOX7, SOX8a, SOX8b, SOX9, SOX9a, SOX9b, SOX10, SOX10a, SOX10b, SOX11a, SOX11b, SOX13, SOX14, SOX17b, SOX18, SOX19a, SOX19b, SOX21a and SOX21b (Table 2). Complete information on their corresponding genomic sequences, coding sequences (CDS) and total number of exons is summarized in Table 2. The maximum genomic length was 3496 bp (SOX6), and the coding regions of SOX9 and SOX18 were 1373 bp and 1328 bp, encoding 457 and 442 amino acids, respectively. We further grouped the coding sequences by nucleotide length into four clusters: the first cluster contains genes of about 1000 bp, SOX1a (1010), SOX1b (1007), SOX4a (1046), SOX11a (1061) and SOX11b (1064); the second cluster contains genes starting from 1100 bp, SOX10b (1151), SOX10a (1145), SOX13 (1187) and SOX7 (1193); the third cluster contains two genes with different coding sequence lengths, SOX8a (1265) and SOX9b (1316); and the fourth cluster covers the shortest coding sequences, from 413 to 944 bp. No exon number was available for SOX9 and SOX13; SOX1a, SOX2, SOX3, SOX4a, SOX4b, SOX10, SOX21a and SOX21b each have one exon; SOX7, SOX11a, SOX11b, SOX18, SOX19a and SOX19b each have two exons; and SOX8a, SOX8b, SOX9a, SOX9b, SOX10a, SOX10b, SOX14 and SOX17b each have three exons. SOX6 has four exons, and SOX5 and SOX1b both have six. Several SOX genes were located on individual chromosomes, namely SOX8a on chromosome 5, SOX19a (9), SOX14 (10), SOX13 (22), SOX9a (23), SOX9b (30), SOX9 (31), SOX3 (35), SOX8b (41) and SOX2 (44), and all remaining genes are located on the unplaced scaffold region of the common carp genome.
The comparative analysis of common carp SOX genes with higher vertebrates (human, mouse, dog and chimpanzee) and fishes (Nile tilapia, goldfish, medaka, rainbow trout, guppy and Atlantic salmon) is given in Table 3. The chromosomal positions of SOX genes in individual teleost and other vertebrate species were identified and are given in Table 4. A chromosome marked with a star (*) indicates the presence of a highly conserved gene and a similar variant at that chromosome position (Table 5).
Table 2

Summary of the SOX genes identified in common carp (Cyprinus carpio), using all available SOX genes of zebrafish (Danio rerio) from the publicly accessible Ensembl database as query sequences. The table lists the accession number (AN), name of each Sox gene, genomic length (bp), CDS length (bp), CDS length (amino acids), number of exons, chromosome number, genome location and the assembly available for open access. In this table, some SOX genes are located on individual chromosomes and all remaining genes are on the unplaced region.

AN | Sox gene | Genomic length (bp) | CDS (bp) | CDS (aa) | No. of exons | Chromosome number | Genome location | Assembly
XM_019075156.1 | Sox1a | 2209 | 1010 | 336 | 1 | Unplaced | NW_017538026.1 (90488..92696) | GCF_000951615.1
XM_019072343.1 | Sox1b | 1357 | 1028 | 342 | 1 | Unplaced | NW_017537908.1 (994268..995625) | GCF_000951615.1
XM_019065499.1 | Sox2 | 2120 | 947 | 315 | 1 | 44 | NC_031740.1 (10639735..10641853) | GCF_000951615.1
XM_019122086.1 | Sox3 | 1505 | 899 | 299 | 1 | 35 | NC_031731.1 (13437760..13439264) | GCF_000951615.1
XM_019079512.1 | Sox4a | 1670 | 1046 | 348 | 1 | Unplaced | NW_017538201.1 (270388..272057) | GCF_000951615.1
XM_019069830.1 | Sox4b | 1503 | 929 | 309 | 1 | Unplaced | NW_017537801.1 (274672..276174) | GCF_000951615.1
XM_019089698.1 | Sox5 | 925 | 818 | 272 | 6 | Unplaced | NW_017540372.1 (84204..107152) | GCF_000951615.1
XM_019083302.1 | Sox6 | 3496 | 509 | 169 | 4 | Unplaced | NW_017538301.1 (846107..850788) | GCF_000951615.1
XM_019069299.1 | Sox7 | 1571 | 1193 | 397 | 2 | Unplaced | NW_017537780.1 (246443..249212) | GCF_000951615.1
XM_019064819.1 | Sox8a | 2568 | 1265 | 421 | 3 | 5 | NC_031701.1 (904555..907715) | GCF_000951615.1
XM_019063104.1 | Sox8b | 962 | 944 | 314 | 3 | 41 | NC_031737.1 (4710056..4713500) | GCF_000951615.1
DQ201318.1 | Sox9 | 2107 | 1373 | 457 | 0 | 31 | NC_031727.1 (12689194..12693097) | GCF_000951615.1
XR_002019411.1 | Sox9a | 2906 | 637 | 457 | 3 | 23 | NC_031719.1 (41616..44965) | GCF_000951615.1
XM_019117253.1 | Sox9b | 2534 | 1316 | 438 | 3 | 30 | NC_031726.1 (7371120..7374563) | GCF_000951615.1
XM_019092904.1 | Sox10 | 523 | 443 | 147 | 1 | Unplaced | NW_017541515.1 (421..943) | GCF_000951615.1
MF573939.1 | Sox10a | 3221 | 1145 | 481 | 3 | Unplaced | NW_017543385.1 (111150..116572) | GCF_000951615.1
MF538663.1 | Sox10b | 3167 | 1151 | 483 | 3 | Unplaced | NW_017543385.1 (111150..116572) | GCF_000951615.1
XM_019096628.1 | Sox11a | 1271 | 1061 | 353 | 2 | Unplaced | NW_017542566.1 (213843..215409) | GCF_000951615.1
XM_019096626.1 | Sox11b | 1728 | 1064 | 354 | 2 | Unplaced | NW_017542566.1 (357335..359362) | GCF_000951615.1
XM_019083671.1 | Sox13 | 1341 | 1187 | 395 | - | 22 | NC_031718.1 (7903690..7921638) | GCF_000951615.1
XM_019104301.1 | Sox14 | 2000 | 716 | 238 | 3 | 10 | NC_031706.1 (7744712..7748222) | GCF_000951615.1
XM_019086643.1 | Sox17b | 924 | 923 | 307 | 3 | Unplaced | NW_017539346.1 (376..1868) | GCF_000951615.1
XM_019069992.1 | Sox18 | 1609 | 1328 | 442 | 2 | Unplaced | NW_017537809.1 (403527..406325) | GCF_000951615.1
XM_019097725.1 | Sox19a | 1345 | 890 | 296 | 2 | 9 | NC_031705.1 (5973083..5974515) | GCF_000951615.1
XM_019075219.1 | Sox19b | 1881 | 869 | 289 | 2 | Unplaced | NW_017538029.1 (162955..165835) | GCF_000951615.1
XM_019072662.1 | Sox21a | 1066 | 716 | 238 | 1 | Unplaced | NW_017537923.1 (191505..192570) | GCF_000951615.1
XM_019095743.1 | Sox21b | 2310 | 728 | 242 | 1 | Unplaced | NW_017542346.1 (20482..22791) | GCF_000951615.1
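A quick arithmetic cross-check on the Table 2 columns: a coding sequence of n nucleotides can encode at most n // 3 amino acids. The sketch below lists the genes whose reported amino acid count differs from that value. The numbers are transcribed from Table 2, and how the fused digits of the original rows split into columns is an assumption of this transcription; a flagged row only marks a place worth re-checking against the source records, not a confirmed error.

```python
# (CDS length in bp, reported CDS length in aa), transcribed from Table 2.
TABLE2 = {
    "Sox1a": (1010, 336), "Sox1b": (1028, 342), "Sox2": (947, 315),
    "Sox3": (899, 299), "Sox4a": (1046, 348), "Sox4b": (929, 309),
    "Sox5": (818, 272), "Sox6": (509, 169), "Sox7": (1193, 397),
    "Sox8a": (1265, 421), "Sox8b": (944, 314), "Sox9": (1373, 457),
    "Sox9a": (637, 457), "Sox9b": (1316, 438), "Sox10": (443, 147),
    "Sox10a": (1145, 481), "Sox10b": (1151, 483), "Sox11a": (1061, 353),
    "Sox11b": (1064, 354), "Sox13": (1187, 395), "Sox14": (716, 238),
    "Sox17b": (923, 307), "Sox18": (1328, 442), "Sox19a": (890, 296),
    "Sox19b": (869, 289), "Sox21a": (716, 238), "Sox21b": (728, 242),
}


def inconsistent_rows(table):
    """Return gene names whose reported aa count differs from CDS bp // 3."""
    return [gene for gene, (cds_bp, aa) in table.items() if cds_bp // 3 != aa]


flagged = inconsistent_rows(TABLE2)
print(flagged)
```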
Table 3

Comparative analysis of SOX genes of Cyprinus carpio with other vertebrate species. A star (*) indicates a highly conserved variant of the gene with 90 to 100 % similarity.

Gene NameZebrafishCommon carpNile tilapiaGold fishMedakaRainbow troutGuppyAtlantic salmonChimpanzeeHumanMouseDog
Sox-1a121 + 2*21 + 1*41 + 4*2 + 3*11*1*1*
Sox-1b121 + 2*21 + 1*4*1 + 2*3 + 2*11*1*1*
Sox21211 + 1*11 + 1*1 + 1*1 + 1*1111
Sox31111 + 1*12121111
Sox4a11 + 1*1 + 1*1 + 4*2*2 + 4*1 + 1*5*11*11*
Sox4b11 + 1*1 + 1*1 + 4*1 + 1*2 + 4*1 + 1*5*11*11*
Sox513 + 1*1 + 4*4 + 4*1 + 9*21 + 8*2 + 20*7*1 + 3*1 + 20*1 + 6*
Sox6131 + 8*2 + 5*132 + 3*4 + 2*21*1 + 2*1 + 11*17*
Sox713141 + 1*1 + 1*1 + 1*31111
Sox8a111 + 1*1 + 3*11 + 2*2*1 + 5*11*1*1*
Sox8b111 + 1*1 + 3*1*1 + 2*2*1 + 5*11*1*1*
Sox91122 + 3*14 + 2*1 + 1*5*1111
Sox9a111 + 1*2 + 3*2*2*1 + 1*2 + 2*11*1*1*
Sox9b121 + 1*2 + 3*2*2*2*1 + 4*11*1*1*
Sox1013331 + 2*4141111
Sox 10a11*1 + 1*1 + 2*1 + 2*4*2*4*11*1*1
Sox 10b11*1 + 1*1 + 2*1 + 2*4*1 + 1*4*11*1*1
Sox11a11*1 + 1*4 + 2*13*1 + 1*3*11*1*0
Sox11b11*1 + 1*4 + 2*13*1 + 1*3*11*1*0
Sox131311 + 1*1112 + 3*6*21 + 4*1
Sox1413231 + 1*11 + 1*1 + 1*21111
Sox17b111 + 1*2*1*1*1 + 1*4*11*2*1
Sox1812120211 + 1*1111
Sox19a1112 + 2*1*2 + 2*1*2 + 2*0000
Sox19b1212 + 2*11 + 3*11 + 3*0000
Sox21a1212 + 2*1*2*1*2*11*10
Sox21b11*12 + 2*11 + 1*1211*10
Total274131531835223521111412
Table 4

Presence of SOX genes on the chromosomes of individual species. A chromosome marked with a star (*) indicates the presence of a highly conserved, similar variant at that chromosome position.

Gene NameZebrafishCommon carpNile tilapiaGold fishMedakaRainbow troutGuppyAtlantic salmonHumanMouseDog
Sox-1a9UnplacedLG1692122LG21613*8*0
1825
Sox-1b1UnplacedLG23up23LG23*1713*8*0
67
Sox22244LG174744LG4233334
Sox31435391014LG109XxX
255
Sox4a19UnplacedLG221916*14LG1612*6*13*35
14*
Sox4b16UnplacedLG11161618LG1112*6*13*27
41*
Sox54UnplacedLG1742321LG131712627
29*7
Sox67UnplacedLG 1762LG31111721
30*3*426
Sox720UnplacedLG 15208LG216814
45*15
Sox8a125LG 828813LG8*2816*17*8*
23
Sox8b341LG 4LG3720LG8*316*17*8*
Sox931LG 428813LG828*17119
Sox9a1223LG 4128*13*LG8617*11*9*
37*
Sox9b330LG 812*8*13*LG8*1917*11*9*
Sox103Unplaced3813LG83221510
Sox 10a3 *UnplacedLG 6LG28*112*LG8*12*22*15*10*
Sox 10b3*UnplacedLG 4LG28*817*LG8*3*22*15*10*
Sox11a17UnplacedLG 1917*2217*LG221*2*038*
Sox11b20UnplacedLG 1545*2219*LG2115*2*038*
Sox131122LG 511517LG5221138
Unplaced36*12
Sox14610LG 18*2428LG433923
Unplaced27*
31*
Sox17b7*UnplacedLG 9*2015*LG20198*1*29*
Sox1823UnplacedLG 20up716LG71520224
Sox19a593018*1018*4000
210
Sox19b7UnplacedLG 3321821LG187000
Sox21a6UnplacedLG 16*621*3LG21*25*13*14*0
31*
Sox21b9UnplacedLG 1692122LG22112*14*0
34*
Table 5

Detailed characteristics of the SOX genes in Cyprinus carpio: molecular weight, weight in kilodaltons (kDa), theoretical pI, extinction coefficient, total number of atoms and N-glycosylation sites of the individual SOX genes.

Gene NameMolecular weightWeight in Kilodaltons (kDa)PI (Isoelectric Point)Extinction CoefficientTotal Number of AtomN-glycosylation sites
Sox-1a36152.488136.169.703736049562
Sox-1b35946.333136.79.703736049322
Sox234542.933134.559.743736047312
Sox333326.736133.339.633736045893
Sox4a37999.9801386.083243052301
Sox4b33945.786133.958.593392047050
Sox530049.805630.069.072143042330
Sox619299.677619.35.451341026531
Sox743535.209943.546.144332059661
Sox8a46255.001035.116.645879063460
Sox8b35105.906446.266.653541048260
Sox950794.764550.816.135581069651
Sox9a50794.764550.816.135581069651
Sox9b48461.215648.476.075730066400
Sox1015735.514715.745.611948021600
Sox 10a51135.773251.146.275432069851
Sox 10b51292.902051.36.205432070031
Sox11a39908.266139.915.263839054913
Sox11b39892.238139.95.153839054813
Sox1344313.014544.337.911845061913
Sox1426673.461026.689.693289037131
Sox17b35357.958735.367.303736048542
Sox1847999.507348.016.814389066242
Sox19a32763.918832.779.664034045100
Sox19b31966.147131.979.614183043980
Sox21a26512.528226.519.743289037052
Sox21b26662.827126.679.693140037110

Physicochemical properties of SOX proteins

We examined the physicochemical properties of the individual SOX proteins to better understand their involvement in biological functions, including functional domains, molecular mass, potential N-glycosylation sites, theoretical pI and extinction coefficient values. The molecular weight of most SOX proteins ranged from 31.97 to 51.14 kDa, and the theoretical pI of most SOX proteins was within 7.9.6.5. The extinction coefficient values of the individual proteins are listed in Table 5, with various proteins falling in the range 37360–55810. The number of potential N-glycosylation sites varied among the Cyprinus carpio SOX proteins, from 1 to 3, and a total of nine genes (SOX4b, SOX5, SOX8a, SOX8b, SOX9b, SOX10, SOX19a, SOX19b and SOX21b) had no N-glycosylation sites (Table 5). The secondary structure annotation of all SOX proteins, comprising strands, coils, helices, extracellular regions, putative domain boundaries, signal peptides, beta sheets (43 % to 15 %), transmembrane regions and cysteine residues, is summarized in Fig. 1. The functional domains and motifs of all SOX genes were predicted from their sequences. All SOX genes are closely related according to the distance-based phylogenetic tree, and each gene has an HMG (high-mobility group) domain; HMG proteins are a family of comparatively low-molecular-weight, non-histone chromosomal proteins that bind DNA with low sequence specificity. We also observed that the individual SOX genes contain high-mobility group A and B DNA-binding domains. HMG1-A (also called HMG-T in fish) and HMG2-B are related genes with two distinguishing features: two HMG boxes (A and B), which are homologous folded domains of around 80 amino acid residues, and a long acidic tail containing 20–30 aspartic or glutamic acid residues. All common carp SOX genes fall in HMG group 1, and two genes, SOX7 and SOX18, were found to have a C-terminal domain (Fig. 2).
For the motif (super-secondary structure) analysis, we compared the protein sequences of three organisms (common carp, zebrafish and human) and found ten motifs overall, each shown with a different color code (the motif number, motif symbol and motif consensus are given in the motif legend). All SOX genes, listed in ascending order, have at least one and at most ten motifs on a single sequence; the cutoff values (p-values) and motif locations are given in Fig. 3.
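Potential N-glycosylation sites of the kind counted above are usually predicted from the sequon N-X-S/T, where X is any residue except proline. The sketch below counts such sequons in a sequence. How the authors obtained their counts is not stated, so this is an assumed, simplified criterion, and the example sequence is hypothetical.

```python
import re

# Asparagine followed by any residue except proline, then serine or threonine.
# The lookahead lets overlapping sequons (for example N-N-S-S) each be counted.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(sequence):
    """Return 1-based positions of asparagines in N-X-S/T sequons (X != P)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


toy_protein = "MNGTANPSNNSS"  # hypothetical sequence
sites = n_glycosylation_sites(toy_protein)
print(len(sites), sites)
```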
Fig. 1

SOX protein annotation in Cyprinus carpio.

Fig. 2

SOX gene binding domains: relationship of SOX genes in common carp based on a distance-based phylogenetic tree. Each SOX gene has an HMG (high-mobility group) domain, and SOX7 and SOX18 additionally have C-terminal domains; the phylogenetic tree is displayed with different color-coded ranges labeled with the individual gene names.

Fig. 3

SOX genes in common carp and other vertebrates with their motif location and sequences.

Retrieval of all SOX protein networks from five large protein databases

To retrieve all SOX protein networks, we used GeneMANIA to collect the interactions of the individual proteins from different protein databases: 22 query-related protein interactions involving 44 proteins, across 5 wide-ranging attributes linked with 2035 individual proteins. The networks were further categorized according to the weight of each interaction type, shown as color-coded lines: shared protein domains (30.93 %), InterPro (26.45 %), Pfam (18.35 %), SMART (7.70 %), Ensembl protein family (6.62 %), superfamily (5.44 %), co-expression among all proteins (4.09 %) and physical interaction across the whole network (0.42 %), as shown in Fig. 4.
Fig. 4

Retrieval of all SOX protein networks from five large protein databases.

Protein-protein interaction and GO of all SOX proteins

For protein-protein interaction analysis, we used the STRING database to retrieve the interaction network, which had an average clustering coefficient of 0.417, 20 nodes with an average node degree of 0.9, and 9 edges against 1 expected edge. Many SOX proteins remained isolated in this network, although they may interact with many other proteins (Fig. 5b), but some SOX proteins formed strongly interacting clusters: network 1 consisted of SOX19a, SOX11a and SOX3; network 2 of SOX21a, SOX21b and SOX2; and network 3 of SOX13, SOX6, SOX19b, SOX5, SOX32 and SOX10 (Fig. 5a). For better understanding, we divided all SOX proteins into 9 clusters using k-means clustering. Each cluster was composed of closely connected interacting proteins, and the queried proteins were used to predict a total of 13 functional partner proteins, including pou5f3 (472 aa), runx2a (467 aa), nanog (384 aa), klf4a (352 aa), pax6a (461 aa), myca (409 aa), ctnnb1 (780 aa), klf4b (409 aa) and foxd3 (371 aa), with a p-value < 1.0e-16 and scores between 0.958 and 0.981. In the interaction network, ball symbols denote unspecified effects, arrows denote positive action effects, and one interaction shows a negative effect. The clusters are shown in different colors in Fig. 5b.

The GO probabilities for all SOX genes were displayed in five p-value classes labeled with different colors (orange, ≤ 1e-10; yellow, 1e-10 to 1e-0; green, 1e-0 to 1e-6; light blue, 1e-6 to 1e-4; and gray, > 0.01). The greatest involvement of these genes was in metabolic process (49.796 %), followed by organic substance metabolic process (45.508 %), biological regulation (45.188 %), regulation of cellular process (39.68 %), multicellular organismal process (23.23 %), developmental process (21.74 %), biosynthetic process (19.992 %), system development (16.59 %), gene expression (16.05 %) and RNA metabolic process (14.337 %), as shown in Fig. 6.
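The STRING summary statistics reported above can be cross-checked with the standard relation for an undirected graph, in which the average node degree equals 2E/N because each edge contributes to the degree of two nodes. A minimal sketch using the node and edge counts quoted in the text follows; the function name is chosen for illustration and is not part of any STRING tool.

```python
def average_degree(n_nodes: int, n_edges: int) -> float:
    """Average node degree of an undirected graph (each edge touches two nodes)."""
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    return 2 * n_edges / n_nodes


# Node and edge counts as reported for the SOX network.
n_nodes, n_edges = 20, 9
avg = average_degree(n_nodes, n_edges)
print(f"average node degree = {avg}")
```

With 20 nodes and 9 edges this gives 0.9, which agrees with the average node degree stated for the network.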
Fig. 5

a SOX Protein interaction (Protein-protein interaction) with other proteins. b SOX Protein interaction (Protein-protein interaction) with other proteins (K means clustering).

Fig. 6

Gene ontology analysis of all SOX proteins.

Phylogenetic analysis of SOX genes

In the phylogenetic analysis, all SOX genes of common carp grouped with their corresponding homologs from higher vertebrates and other fish species: human (Homo sapiens), mouse (Mus musculus), dog (Canis lupus familiaris), Nile tilapia (Oreochromis niloticus), goldfish (Carassius auratus), chimpanzee (Pan troglodytes), medaka (Oryzias latipes), rainbow trout (Oncorhynchus mykiss), guppy (Poecilia reticulata) and Atlantic salmon (Salmo salar). This grouping indicates that the protein sequences of the SOX gene family are highly conserved, except for SOX19 of common carp, zebrafish, Nile tilapia and goldfish, which mixed with SOX21. The SOX genes of the higher vertebrates and teleosts assembled into individual clades, shown in different color ranges; our candidate fish (Cyprinus carpio) is indicated in red in the phylogenetic tree and was found in all respective clades. Within the teleost cluster, common carp was closest to zebrafish (Danio rerio), goldfish (Carassius auratus) and rainbow trout (Oncorhynchus mykiss), as shown in Fig. 7. The multiple alignment of all SOX protein sequences of the individual organisms is given in supplementary file 1.
Fig. 7

The phylogenetic analysis of all SOX genes of common carp with their corresponding homologs from other higher vertebrates and fish species.

Synteny analysis of SOX genes

We characterized all common carp SOX genes by their phylogenetic association, which confirmed their close relationship to one another, and displayed the intron and exon structure of each gene. A total of six genes lack introns: two (Acc. # DQ201318 and XM_019063104.1) lie in the first cluster at the start of the tree, two (Acc. # MF573939.1 and MF538663.1) lie in the middle of the tree with the same cluster weight, and the last two (Acc. # XM_019083671.1 and XM_019086643.1) lie at the end of the tree (Fig. 8). All the SOX genes of common carp (Cyprinus carpio) were physically mapped against zebrafish (Danio rerio) and human (Homo sapiens) at the chromosome level, as far as data were available from public databases, with color coding: red for common carp, green for zebrafish and blue for human chromosomes. Among all chromosomes, the highest number of SOX genes (2) in common carp was found on an unplaced region, which matched many chromosomes in zebrafish and human, with a maximum of three genes located on human chromosomes 11, 12 and 22 and at least one SOX gene on zebrafish chromosomes. In detail, SOX1B is present on chromosome 1 in zebrafish and chromosome 13 in human. All genes were mapped according to their order and presence on the respective chromosomes (Fig. 9), except SOX9 in zebrafish and SOX19A and SOX19B in human, because they were absent from the respective genomes. The distribution of SOX genes on chromosomes also revealed certain physical regions with a comparatively higher accumulation of SOX genes (gene clusters).
Fig. 8

Phylogenetic association and presence of introns and exons in individual genes.

Fig. 9

Physical chromosome mapping of common carp against zebrafish (Danio rerio) and human (Homo sapiens).

Discussion

The main goal of the current study was the genome-wide identification of the highly conserved SRY-related HMG-box (SOX) genes in the common carp genome, using in silico techniques for gene characterization and structural insight into the protein sequences, supported by phylogenetic and syntenic analysis. In higher vertebrates and in fishes such as channel catfish [42], tilapia [23], pufferfish [43], zebrafish [44] and medaka [20], SOX genes show a diverse distribution pattern among gene subfamilies. Based on the conserved structure of protein domains and intron-exon regions, the following genes are recognized: SOX1A, SOX1B, SOX2, SOX3, SOX4A, SOX4B, SOX5, SOX6, SOX7, SOX8A, SOX8B, SOX9, SOX9A, SOX9B, SOX10, SOX10A, SOX10B, SOX11A, SOX11B, SOX13, SOX14, SOX17B, SOX18, SOX19A, SOX19B, SOX21A and SOX21B [3], [45], [46]. All SOX proteins, from SOX1 to SOX32, are a class of transcriptional regulators related to the sex-determining factor SRY of mammals and other vertebrates [1]. Owing to evolutionary whole-genome duplication, the number of SOX genes varies across species: 27 in Nile tilapia, 27 in zebrafish, 25 in pufferfish, 19 in medaka, 20 in human, 10 in Florida lancelet, and 18 each in western clawed frog and chicken [23], [46]. Some SOX genes, such as SOX12, SOX15 and SOX16, could not be identified in certain teleost species [46], for example sox12 in acanthopterygian genomes and sox30 in other fish orthologues. To date, more than 20 different SOX genes have been discovered in vertebrates based on conservation of the HMG box and classified by phylogenetic analysis [22]. SOX1 is functionally highly conserved with SOX1A, SOX1B, SOX2 and SOX3, all of which have been identified in the genomes of Homo sapiens [47], Rattus norvegicus [48], Gallus gallus [49] and Xenopus tropicalis [50]. SOX5 and SOX6 have been identified in human, mouse and zebrafish (16, 17). 
SOX7, SOX17 and SOX18 are three closely related SOX proteins that have been identified in many genomes, including mouse, human and teleosts (19). In the current study, the full lengths of the SOX1A and SOX1B genes in common carp are 2209 bp and 1357 bp, with open reading frames (ORFs) of 1010 bp and 1028 bp that encode proteins of 336 and 342 amino acids (aa), respectively. In Chinese sturgeon (Acipenser sinensis), the full length of SOX1 is 2029 bp, encoding 343 amino acids [51]. In Nile tilapia, the SOX1A CDS is 1533 bp long and encodes a 344-aa protein, and SOX1B encodes a 354-aa protein [23]. SOX1 has already been reported in a broad range of animal taxa, especially teleosts such as zebrafish, catfish, medaka, Nile tilapia and rainbow trout [23], [46], [52]. In common carp, SOX1A and SOX1B each have one exon, as in zebrafish, whereas in Nile tilapia SOX1A has one exon and SOX1B has seven [23]. The full-length cDNA of the SOX2 gene is 997 bp, encoding 315 aa with one exon, which is highly comparable with the CDS of Indian major carps (Catla catla, Cirrhinus cirrhosus and Labeo rohita) and Nile tilapia, which have 936–997 nucleotides encoding 315, 322 and 404 aa [23], [53], [54]. The SOX genes identified in the current work, from SOX3 to SOX21, have CDS lengths of 716 to 1373 bp, full lengths of 523 to 3496 bp, putative protein sequences of 147 to 483 aa, pI > 6 and molecular weights in the range of 50 kDa, which are highly comparable with SOX genes reported earlier in fish species [23], [42], [53], [55]. The proteins encoded by the SOX family of common carp were analogous to those of higher vertebrate species. They contain highly conserved HMG domains and the structural motifs of the model zebrafish, and comparison of multiple protein sequences shows well-conserved domains across species. Our observations in all selected organisms agree strongly with earlier reports confirming that the HMG domains of the SOX gene family are highly conserved [53], [45]. 
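The relationship between the ORF lengths and protein lengths quoted above for SOX1A and SOX1B can be checked with simple codon arithmetic: a coding sequence for an n-residue protein, counted together with its stop codon, spans 3(n + 1) nucleotides. The sketch below applies this to the reported values. The helper name is hypothetical, the pairing of 1010 bp with 336 aa and 1028 bp with 342 aa is assumed from the order in the text, and the text does not state whether the ORF lengths were counted with or without the stop codon.

```python
def cds_length_with_stop(n_residues: int) -> int:
    """Nucleotides needed to encode n residues plus one stop codon."""
    return 3 * (n_residues + 1)


# (reported ORF length in bp, reported protein length in aa) for common carp.
reported = {"SOX1A": (1010, 336), "SOX1B": (1028, 342)}

# Difference between the reported ORF length and the codon-arithmetic expectation.
diffs = {}
for gene, (orf_bp, n_aa) in reported.items():
    expected = cds_length_with_stop(n_aa)
    diffs[gene] = orf_bp - expected
    print(f"{gene}: expected {expected} bp, reported {orf_bp} bp, difference {diffs[gene]}")
```

Under this convention both reported ORF lengths are one base shorter than 3(n + 1), which may simply reflect how the sequence coordinates were counted.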
The HMG (high mobility group) proteins are a family of comparatively low-molecular-weight, non-histone chromosomal proteins that bind DNA with low sequence specificity [56]. We also observed that the individual SOX genes include high mobility group A and B DNA-binding domains; HMG1-A (also called HMG-T in fish) and HMG2-B are related to other vertebrate genes. The same observation was made in the report on the structural features of HMG genes and proteins [57], which describes their distinguishing features: two HMG boxes (A and B), which are homologous folded domains of approximately 80 amino acid residues, and a long acidic tail containing 20–30 aspartic or glutamic acid residues. Among the common carp SOX genes, HMG Group-I (SOX7) and Group-II (SOX18) have C-terminal domains. The SOX gene family of HMG-box (PDOC00305) transcription factors, related to the sex-determining factor encoded on the Y chromosome, plays a vital role in embryonic development. In teleosts, sex chromosomes are commonly monomorphic and probably evolutionarily young [58]. Bioinformatics analysis showed that SOX18 groups with SOX7 and SOX17 and shows C-termini across different higher vertebrates and teleosts (Homo sapiens, Mus musculus, Canis lupus familiaris, Oreochromis niloticus, Carassius auratus and Pan troglodytes); these proteins have highly conserved aa residues and contain a strong transactivating domain [59], [60]. In protein-protein interaction, the SOX family interacts with many other transcription factors, such as POU5F3, RUNX2A, NANOG, KLF4A, PAX6A, MYCA, CTNNB1, KLF4B and FOXD3. All the common carp SOX genes in these networks are directly or indirectly involved in sex determination mechanisms, including the Sry-related HMG box. 
This has previously been observed in other vertebrates, including fish, in which the genes reported to be related to sex determination mechanisms include the Sry-related HMG box (SOX) genes, anti-Müllerian hormone receptor type 2, doublesex and mab-3 related transcription factors, and growth differentiation factor 6. Likewise, SOX proteins formed strongly interacting network clusters: network 1 consisted of SOX19a, SOX11a and SOX3, as reported earlier in zebrafish [61]; network 2 consisted of SOX21a, SOX21b and SOX2, comparable to Japanese rice fish [20]; and network 3 of common carp included SOX13, SOX6, SOX19b, SOX5, SOX32 and SOX10. The same trend as in network 3 was observed in a study of SOX proteins in cell regulation and interaction in fish, in which SOX13, SOX6, SOX5, SOX32 and SOX10 formed a network [26]. In the GO analysis of common carp, SOX Group-I (SOX1, SOX2, SOX3 and SOX19) was involved in positive regulation of DNA-binding transcription factor activity, positive regulation of molecular function, and metabolic processes, with the percentages given in the Results and the pathway relations shown in Fig. 4. The SOX1, SOX2 and SOX3 proteins bind DNA specifically via the HMG domain and act as transcription factors, or take part in protein-protein interactions, for example with POU proteins [29]. SOX1 and SOX3 have a C-terminal region that enables them to act as transcription factors by forming protein-protein complexes, and sox1 is expressed during development in the formation of the neural plate [30]. Expression of SOX2 and SOX3 helps maintain progenitor cell identity by inhibiting neurogenesis [29]. In the gene ontology analysis of common carp, Group-II (SOX5, SOX6 and SOX13) was involved in negative regulation of transcription (DNA-templated), gene expression and macromolecule metabolic process. 
The sox5 and sox6 genes have been identified in human, mouse and other organisms. Sox5 is expressed in developing CFu neurons and prevents premature differentiation [62]; in mammals and fish, sox6 acts as a transcription factor that controls skeletal muscle fiber identity [31]. SOX13 is expressed in different embryonic lineages, including the developing central nervous system (CNS), the visceral mesoderm of the yolk sac, pancreas, kidney and liver [63]. In common carp, Group-III (SOX8, SOX9, SOX10 and SOX21) was involved in negative and positive regulation of transcription by RNA polymerase II. Sox8, sox9 and sox10 are involved in many developmental processes, such as testis development, maintenance of male fertility, the early developmental state of human gonads, expression in somatic cells, and sex determination with the help of the DMRT1 gene [32], [33]. Sox9 is an essential transcriptional regulator for the development of adult cartilage and activates transcription of the genes for structural components [34]. In the regulatory network of oligodendrocytes and Schwann cells, the sox10 HMG domain shows a complex relationship with sox8 and sox9, and these are important transcription factors for lineage progression, terminal differentiation and myelin induction [64]. In the present work, the chromosomal organization and synteny analysis of common carp against zebrafish and human indicates that all SOX genes are located on different chromosomes and apparently do not follow a unique pattern. Most chromosome linkage is unknown in common carp, and the highest number of SOX genes (2) was found on an unplaced region that matched many chromosomes in zebrafish and human, with a maximum of three genes located on human chromosomes 11, 12 and 22. The same pattern was reported earlier in zebrafish on chromosomes 3, 12 and 22 [65]. 
The SOX2 gene was previously localized in the 44-chromosome (2n) genomes of tilapiine species such as Nile tilapia, Mozambique tilapia and Oreochromis aureus, and of other teleosts (Haplochromis obliquidens, Kennyi cichlid, scrapermouth mbuna and Pseudetroplus maculatus), which is analogous to the current observation in common carp [65]. In our genome-wide analysis, the SOX3 gene is located on the X chromosome of human, dog and mouse, and on the basis of the HMG-box region it is more closely related to the common carp chromosome 34 sequence than to other teleosts, which is highly comparable with earlier reports [66], [67]. Previous studies also reported that SOX3 is an X-linked gene in multiple organisms [68], including human [69], mouse [70], dog, frog and some fishes [71]. Numerous gene clusters of common carp were found on unknown regions, similar to recent genome-wide identifications of SOX genes in Nile tilapia [23] and Paralichthys olivaceus [72]. To trace the evolutionary history of the SOX gene family, phylogenetic analyses were performed among higher vertebrates and teleosts of the class Actinopterygii; all SOX genes were identified and clustered in their respective clades, and common carp showed a close phylogenetic relationship with zebrafish and goldfish. The same trend in the phylogenetic analysis of the SOX gene family has been observed in recent studies [53], [46]. The diversity of SOX protein sequences in common carp is consistent with the numbers of SOX genes identified in other fishes such as Oryzias latipes, Nile tilapia, zebrafish, goldfish and channel catfish [23], [42]. The mammalian (human, dog, mouse and chimpanzee) SOX genes 12, 15, 16 and 20, a group represented by single genes such as SOX15 and SRY, were not found in the common carp genome. In our syntenic analysis, SOX3 is considered to be linked to the mammalian Y chromosome; previous research indicates that SRY evolved from the X-linked gene SOX3 [73]. 
A possible explanation is the wide range of sox genes identified in diverse fish genomes and other species [74]. A large number of novel genes could be derived from regions that are not conserved between common carp and other teleosts [75], and these could not be annotated with functionally informative proteins.

Conclusion

In conclusion, we identified an expanded set of 27 SOX genes in common carp: SOX1A, SOX1B, SOX2, SOX3, SOX4A, SOX4B, SOX5, SOX6, SOX7, SOX8A, SOX8B, SOX9, SOX9A, SOX9B, SOX10, SOX10A, SOX10B, SOX11A, SOX11B, SOX13, SOX14, SOX17B, SOX18, SOX19A, SOX19B, SOX21A and SOX21B. All SOX genes have conserved HMG domains across higher vertebrates and teleosts. In chromosomal mapping, the SOX genes were located on a diverse set of chromosomes, and most were found on unplaced regions. The current evidence will support further understanding of the structural and functional properties of the SOX gene family in fishes.

Funding

This work did not receive any funding from agencies in the public or commercial domains.

Declaration of Competing Interest

The authors report no declarations of interest.
References (73 in total)

Review 1.  Ecology meets endocrinology: environmental sex determination in fishes.

Authors:  John Godwin; J Adam Luckenbach; Russell J Borski
Journal:  Evol Dev       Date:  2003 Jan-Feb       Impact factor: 1.930

Review 2.  Structural features of the HMG chromosomal proteins and their genes.

Authors:  M Bustin; D A Lehn; D Landsman
Journal:  Biochim Biophys Acta       Date:  1990-07-30

3.  The combination of SOX5, SOX6, and SOX9 (the SOX trio) provides signals sufficient for induction of permanent cartilage.

Authors:  Toshiyuki Ikeda; Satoru Kamekura; Akihiko Mabuchi; Ikuyo Kou; Shoji Seki; Tsuyoshi Takato; Kozo Nakamura; Hiroshi Kawaguchi; Shiro Ikegawa; Ung-il Chung
Journal:  Arthritis Rheum       Date:  2004-11

Review 4.  Gene-balanced duplications, like tetraploidy, provide predictable drive to increase morphological complexity.

Authors:  Michael Freeling; Brian C Thomas
Journal:  Genome Res       Date:  2006-07       Impact factor: 9.043

Review 5.  Sry: the master switch in mammalian sex determination.

Authors:  Kenichi Kashimada; Peter Koopman
Journal:  Development       Date:  2010-12       Impact factor: 6.868

6.  Identification, molecular characterization and gene expression analysis of sox1a and sox1b genes in Japanese flounder, Paralichthys olivaceus.

Authors:  Jinning Gao; Wei Zhang; Peizhen Li; Jinxiang Liu; Huayu Song; Xubo Wang; Quanqi Zhang
Journal:  Gene       Date:  2015-08-08       Impact factor: 3.688

7.  X-linked sex-determining region Y box 3 (SOX3) gene mutations are uncommon in men with idiopathic oligoazoospermic infertility.

Authors:  Gerald Raverot; Herve Lejeune; Tom Kotlar; Michel Pugeat; J Larry Jameson
Journal:  J Clin Endocrinol Metab       Date:  2004-08       Impact factor: 5.958

8.  A gene from the human sex-determining region encodes a protein with homology to a conserved DNA-binding motif.

Authors:  A H Sinclair; P Berta; M S Palmer; J R Hawkins; B L Griffiths; M J Smith; J W Foster; A M Frischauf; R Lovell-Badge; P N Goodfellow
Journal:  Nature       Date:  1990-07-19       Impact factor: 49.962

9.  The Drosophila SOX-domain protein Dichaete is required for the development of the central nervous system midline.

Authors:  N S Soriano; S Russell
Journal:  Development       Date:  1998-10       Impact factor: 6.868

10.  The role of Sox6 in zebrafish muscle fiber type specification.

Authors:  Harriet E Jackson; Yosuke Ono; Xingang Wang; Stone Elworthy; Vincent T Cunliffe; Philip W Ingham
Journal:  Skelet Muscle       Date:  2015-01-27       Impact factor: 4.912
