
Genomic Survey of Tyrosine Kinases Repertoire in Electrophorus electricus With an Emphasis on Evolutionary Conservation and Diversification.

Ling Li1, Dangyun Liu2, Ake Liu3, Jingquan Li1, Hui Wang1, Jingqi Zhou1.   

Abstract

Tyrosine kinases (TKs) play key roles in the regulation of multicellularity in organisms and are involved primarily in cell growth, differentiation, and cell-to-cell communication. Genome-wide characterization of TKs has been conducted in many metazoans; however, systematic information regarding this superfamily in Electrophorus electricus (electric eel) is still lacking. In this study, we identified 114 TK genes in the E electricus genome and investigated their evolution, molecular features, and domain architecture using phylogenetic profiling to gain a better understanding of their similarities and specificity. Our results suggested that the electric eel TK (EeTK) repertoire was shaped by whole-genome duplications (WGDs) and tandem duplication events. Compared with other vertebrate TKs, members of the Jak, Src, and EGFR subfamilies were specifically duplicated in the electric eel, whereas members of the Eph, Axl, and Ack subfamilies were lost. We also conducted an exhaustive survey of TK genes in genomic databases, identifying 1674 TK proteins in 31 representative species covering all the main metazoan lineages. Extensive evolutionary analysis indicated that the TK repertoire in vertebrates tended to be remarkably conserved, whereas the gene members within each subfamily were highly variable. Comparative expression profile analysis showed that electric organ tissues and muscle shared a similar pattern with specific highly expressed TKs (ie, epha7, musk, jak1, and pdgfra), suggesting that regulation of TKs might play an important role in specifying an electric organ identity from its muscle precursor. We further identified TK genes exhibiting tissue-specific expression patterns, indicating that TK members have undergone subfunctionalization, an evolutionary divergence required for the performance of different tissues. 
This work generates valuable information for further gene function analysis and identifying candidate TK genes reflecting their unique tissue-function specializations in electric eel.
© The Author(s) 2020.

Keywords:  Electrophorus electricus; expression pattern; metazoans; phylogenetic analysis; tyrosine kinase

Year:  2020        PMID: 32546936      PMCID: PMC7249569          DOI: 10.1177/1176934320922519

Source DB:  PubMed          Journal:  Evol Bioinform Online        ISSN: 1176-9343            Impact factor:   1.625


Introduction

Tyrosine kinases (TKs) are a large and diverse superfamily of enzymes that catalyze the phosphorylation of select tyrosine residues in target proteins using ATP.[1] TKs are important mediators of signal transduction processes that regulate cell proliferation, differentiation, migration, metabolism, and programmed cell death.[2,3] According to whether their protein sequences contain transmembrane domains, TKs are further classified into 2 major families, receptor TKs (RTKs) and nonreceptor or cytoplasmic TKs (CTKs). The typical structural organization of RTKs includes a multidomain extracellular receptor that conveys ligand specificity, a transmembrane hydrophobic helix, and a cytoplasmic portion containing a kinase domain (KD).[4] Most CTKs are associated with phosphotyrosine (pTyr) binding within cells and are likely to transmit the pTyr signals initiated by receptors, while RTKs are involved primarily in responding to extracellular ligands by phosphorylating intracellular target proteins to initiate signal transduction cascades.[5,6] At present, TKs have become the subject of an increasing number of studies in physiology and pathology, including inflammation, autoimmunity, neurodegeneration, and infectious diseases. Similar to many other gene families, the TK family underwent differential expansion during metazoan evolution. 
After years of research, TKs have been characterized in many species, including many metazoans and a few premetazoans, and these findings have provided numerous vital insights into the structure, function, and regulation of TKs.[7] Phylogenetic analyses have shown that the RTKs underwent extensive diversification in each of the filasterean, choanoflagellate, and metazoan clades.[8,9] For instance, the VEGFR and Ephrin receptor subfamilies expanded through single-gene duplications in jawed vertebrates, and specific expansions of the Eph and EGFR subfamilies occurred between humans and zebrafish.[10] Although the highly conserved TK domain is present in all TK proteins, these proteins also contain a diverse arrangement of sequence domains involved in interactions with other molecules that allow for different signal transduction mechanisms.[1] Such divergent architectures are thought to result from gene duplication and domain shuffling events.[11] Based on sequence similarity and secondary domain architecture, TKs are further divided primarily into 30 subfamilies that are composed of CTKs and RTKs with specific functions in most metazoans. It has been assumed that 2 major episodes of expansion occurred in the TK family. The initial diversification occurred before poriferans and the other metazoans diverged. The other expansion occurred around the time that the cyclostomes and gnathostomes diverged. The diversity of TKs may imply the complexity of the pTyr-based signaling system in metazoans.[12,13] The evolution of TK activity, its regulation in metazoans, and its involvement in the evolution of multicellularity have been the subject of intriguing studies,[14,15] and such research is facilitated by the completion of genome sequencing projects for a variety of different animals. 
The electric eel (Electrophorus electricus) is a freshwater fish native to South America that is best known for its ability to produce high-voltage electric discharges that are used for communication, navigation, and even predation and defense.[16] The taxonomic diversity of fishes that generate electricity is particularly interesting: electric organs have evolved from myogenic precursors in multiple independent fish lineages, and even within a single lineage, there is a tremendous amount of variation in their function.[17] One of the challenges in understanding protein function evolution involves the identification of a tractable model system that allows for an assessment of the core assumptions. The electric eel is the only known species that has evolved 3 distinct electric organs (EO) and provides a unique case study opportunity. Gene family construction is a widely used approach to examining the underlying assumptions that are used to characterize the evolutionary process in both systematics and functional biology.[18] Despite extensive studies of TK genes in many other species, little is known about this superfamily in E electricus. Understanding the regulation of TK genes will be valuable for efforts to induce the differentiation of electrogenic cells in other tissues and organisms and to control the intrinsic electric behaviors of these cells. The recent availability of the complete electric fish genome sequence provides an opportunity to perform a genome-wide analysis of the TK gene family.[16,19] In this study, we analyzed the whole genome sequence of E electricus and systematically identified the putative full complement of TK genes (EeTKs). Using comprehensive phylogenetic approaches together with comparisons of orthology and protein domain organization, as well as analyses of expression profiles in different tissues, we elucidated in detail the evolution of TK expansion and diversity. 
This article provides the first comprehensive resource of electric eel TKs. The results of this study reveal commonalities and differences for the EeTKs among vertebrates and provide valuable information on EeTK diversity that we hypothesize underlies the functional differences of specific organs, for example, in the production of voltages. Our findings may also provide further insights into metazoan TK gene evolution and would facilitate addressing the physiological and developmental function of electric eel TKs.

Materials and Methods

Retrieval of TK sequences

To perform genome-wide identification and obtain sequences of the PTK gene family in the electric eel, published genome sequences of E electricus were first downloaded from the Ensembl database (https://useast.ensembl.org/Electrophorus_electricus/Info/Index), and redundant sequences were deleted using an in-house Perl script. The Pfam database (http://pfam.xfam.org/)[20] was used to screen the genome of E electricus. Proteins with Pkinase_Tyr domains (PF07714) were used to identify the putative TK proteins in E electricus using the hidden Markov model (HMM) method. For the HMM search (3.1b2),[21] a coverage of 0.3 was used as the cutoff, and an E-value of 10⁻¹⁰ was used for alignments longer than 150 amino acids. The online tools SMART (http://smart.embl-heidelberg.de/)[22] and CDD (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) were used to verify the TK domains of the predicted proteins. The human and zebrafish TK proteins were retrieved from previously published data.[10,23,24] We also selected an additional 30 representative metazoan species to identify their TK members, and the information for these species is listed in Table S1. The TK sequences of these species were obtained as described above for those of the electric eel.
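
The hit-filtering step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the text does not say how coverage was defined or how the thresholds were combined, so the sketch assumes that coverage is the fraction of the Pkinase_Tyr profile spanned by the hit and that a hit is kept only when coverage is at least 0.3, the alignment is longer than 150 amino acids, and the E-value is at most 10⁻¹⁰. The `DomainHit` record and `filter_hits` function are hypothetical names; the fields would be filled in from parsed search output.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class DomainHit:
    """One domain hit against the Pkinase_Tyr profile (hypothetical record)."""
    protein_id: str
    evalue: float
    hmm_from: int    # 1-based start on the profile
    hmm_to: int      # inclusive end on the profile
    ali_from: int    # 1-based start on the protein
    ali_to: int      # inclusive end on the protein
    hmm_length: int  # total length of the profile


def filter_hits(hits: Iterable[DomainHit],
                min_coverage: float = 0.3,
                max_evalue: float = 1e-10,
                min_ali_len: int = 150) -> List[DomainHit]:
    """Keep hits passing all three thresholds (assumed reading of the text)."""
    kept = []
    for hit in hits:
        # Assumption: coverage = fraction of the profile spanned by the hit.
        coverage = (hit.hmm_to - hit.hmm_from + 1) / hit.hmm_length
        ali_len = hit.ali_to - hit.ali_from + 1
        if (coverage >= min_coverage
                and ali_len > min_ali_len
                and hit.evalue <= max_evalue):
            kept.append(hit)
    return kept
```

If the intended rule was instead that the E-value cutoff applies only to long alignments while shorter ones are judged on coverage alone, the condition inside the loop would need to be adjusted accordingly.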

Phylogenetic analysis of TK proteins and gene nomenclature

The identified TK proteins were aligned using T-Coffee with the default settings.[25] Phylogenetic analyses of all identified TK domains were performed using the neighbor-joining (NJ) and maximum likelihood (ML) methods. The NJ and ML trees were constructed using the program MEGA 6.0[26] with the gaps/missing data parameter set to “pairwise deletion” to account for variable amino acid sites and topology support estimated using a 1000-replicate bootstrap test. We used ProtTest 3.4 (https://github.com/ddarriba/prottest3) to assess the best amino acid substitution model to infer the phylogeny.[27] PhyML 3.1[28] was used for ML reconstruction under the LG model (an amino acid replacement matrix named after its two authors, Le SQ and Gascuel O). Invariable sites and the γ-parameter were set to the values generated by ProtTest 3.4. Statistical support for the resulting topology was determined by a 1000-replicate bootstrap analysis. All phylogenetic trees were calculated based on the alignment of common TK domains. Genes that had been previously reported were named accordingly, and others were named according to their homologies to those of zebrafish and human PTK subfamily genes in the phylogenetic tree and were tagged with a species abbreviation.

Protein architecture, conserved motif prediction, and synteny analysis

Protein domain organization was analyzed using HMMER software by searching the SMART (http://smart.embl-heidelberg.de) and Pfam (http://pfam.sanger.ac.uk/) databases. We confirmed the presence of these predicted domains by manually inspecting the alignments with HMMER. The conserved motifs in the PTK protein sequences were first identified using the online software MEME (Multiple Expectation-Maximization for Motif Elicitation, http://meme-suite.org)[29] with the following parameters: the optimum motif width was set from 6 to 200, and the maximum number of motifs was 15. The discovered motifs were then annotated using the Pfam program. Based on their physical locations on individual chromosomes, 2 TK genes were considered tandem duplicates when they were separated by no more than 1 intervening gene. For syntenic analysis of the EeTK genes, MCScanX[30] with the default settings was used to identify syntenic gene pairs in the E electricus genome.
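
The tandem duplication criterion (2 TK genes separated by no more than 1 intervening gene) can be illustrated with a short sketch. It assumes, hypothetically, that gene order per scaffold is already available as ordered lists of gene identifiers; the function name `find_tandem_pairs` and the input layout are illustrative and not taken from the study.

```python
from typing import Dict, List, Set, Tuple


def find_tandem_pairs(gene_order: Dict[str, List[str]],
                      tk_genes: Set[str],
                      max_intervening: int = 1) -> List[Tuple[str, str, str]]:
    """Return (scaffold, gene_a, gene_b) for neighboring TK genes.

    gene_order maps each scaffold to its gene IDs in physical order.
    Two TK genes are reported when at most `max_intervening` genes
    lie between them on the same scaffold.
    """
    pairs = []
    for scaffold, genes in gene_order.items():
        # Positions of TK genes along this scaffold, in physical order.
        tk_positions = [(i, g) for i, g in enumerate(genes) if g in tk_genes]
        # Checking consecutive TK genes is enough: any closer pair that
        # skips a TK gene is already linked through that gene.
        for (i, gene_a), (j, gene_b) in zip(tk_positions, tk_positions[1:]):
            if j - i - 1 <= max_intervening:
                pairs.append((scaffold, gene_a, gene_b))
    return pairs
```

Chains of reported pairs on the same scaffold can then be merged into tandem clusters if a cluster-level view is needed.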

Gene ontology annotation

The gene ontology (GO) terms associated with the genes were obtained from the E electricus genome databases (https://efishgenomics.integrativebiology.msu.edu/), and WEGO was used to perform GO functional classification and identify the distribution of gene functions in the electric eel at the macro level. To determine the function of the unigenes, BLASTx alignments with an E-value ⩽ 10⁻⁵ were performed with different databases, including Eukaryotic Orthologous Groups (KOG, http://www.ncbi.nlm.nih.gov/KOG/), the NCBI nonredundant protein database (nr, http://www.ncbi.nlm.nih.gov/), GO (http://www.geneontology.org/), and Swiss-Prot (http://www.expasy.ch/sprot).

Transcriptomic data of TK genes from different tissues of various animals

Publicly available transcriptomic data from 7 tissues (brain, spinal cord, heart, skeletal muscle, Sachs electric organ, primary electric organ, and kidney tissues) of E electricus from Gallant et al[16] were used. High-throughput RNA sequencing raw data were downloaded from the SRA database (http://www.ncbi.nlm.nih.gov/sra). Quality control of the raw reads was performed with FastQC v0.11.5. After clipping of the Illumina adapter sequences and trimming low-quality bases using fastx-toolkit, short reads were individually mapped to their respective transcriptome assemblies using Bowtie2 (v2.2.8) with default parameters.[31] To estimate gene expression levels, clean reads were mapped to nonredundant unigenes to calculate the fragments per kilobase transcript length per million fragments mapped (FPKM) value using RNA-Seq by Expectation-Maximization (RSEM).[32] These expression values were normalized for sequencing depth and transcript length and then scaled via the trimmed mean of M values normalization under the assumption that most transcripts are not differentially expressed. Generally, genes with an expression of FPKM < 1 were regarded as not expressed.[33] We obtained the human TK expression values for the abovementioned tissues from the Genotype-Tissue Expression (GTEx) project[34] (the electric organ was excluded) and selected the median value as the expression value in a given tissue for each gene. To assess the tissue-specific expression of genes, in this study, we used the parameter tau (τ)[35] to determine whether a given gene was expressed in a tissue-specific manner. The τ value was calculated as follows for each gene among the different tissues according to Yanai et al[36]: $\tau = \frac{\sum_{i=1}^{n} (1 - \hat{x}_i)}{n - 1}$, with $\hat{x}_i = x_i / \max_{1 \le i \le n} x_i$, where $x_i$ is the expression of the gene in tissue i, and n is the number of tissues. The reads per kilo base per million mapped reads (RPKM) values were normalized by performing a log2 transformation after adding 1 to avoid negative values. 
The τ values ranged from 0 to 1, which indicate genes ranging from broadly to specifically expressed.
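
As a minimal sketch of the tissue-specificity calculation, the function below applies the commonly used form of the index, τ = Σ(1 − x_i / x_max) / (n − 1), to log2(x + 1)-transformed expression values, as described above. The function name `tau` and the decision to return `None` for a gene with no expression in any tissue (where the index is undefined) are assumptions made for illustration.

```python
import math
from typing import Optional, Sequence


def tau(expression: Sequence[float]) -> Optional[float]:
    """Tissue-specificity index for one gene across n tissues.

    `expression` holds non-negative expression values (e.g. FPKM),
    one per tissue. Values are log2(x + 1)-transformed first.
    Returns None when the gene is not expressed in any tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs at least 2 tissues")
    logged = [math.log2(x + 1.0) for x in expression]
    peak = max(logged)
    if peak == 0.0:
        return None  # index undefined for an all-zero profile
    # Sum of (1 - normalized expression), divided by n - 1.
    return sum(1.0 - x / peak for x in logged) / (n - 1)
```

For example, a gene with equal expression in every tissue gives τ = 0, and a gene expressed in only one tissue gives τ = 1, matching the stated range from broadly to specifically expressed.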

Results

Identification of the TK repertoire of E electricus

For the genome-wide identification of EeTKs, we initially identified the proteins using HMMER[21] and SMART[22] as described in the Materials and Methods section and ensured the integrity of the data using BLAST searches.[36] Our computational analyses led to the identification of 114 distinct TK genes in the E electricus genome, with 70 RTKs organized into 20 subfamilies and 44 CTKs organized into 10 subfamilies. Five of these 30 subfamilies have single members (CCK4, Musk, Sev, Ryk, and Ret subfamilies of RTKs), while the other subfamilies have multiple members, with the Src and Eph subfamilies having the greatest numbers of members (16 and 17, respectively). The detailed information regarding these genes, including the gene name, scaffold location, and the longest transcript ID, is summarized in Table S1.

Phylogenetic position of EeTKs is consistent with the conservation of core motifs

To elucidate the origin of EeTKs, we first conducted a preliminary phylogenetic analysis of the EeTK superfamily. We observed that neither the CTKs nor the RTKs grouped into clearly distinct monophyletic clusters (Figure 1), indicating that CTKs and RTKs were not derived from single genes.[37] In principle, conserved protein sites typically correspond to functional significance. To analyze the organization of motifs in EeTK proteins, 8 distinct motifs in the TK domain were identified with high E-values using the MEME tool. A schematic distribution of these motifs is shown in Figure 1. Most TK members within the same clade, especially the most closely related members, typically share common motif compositions (eg, Eph and Src), indicating potential functional similarities among TK proteins. Among them, all groups of the RTK and CTK protein subfamilies contain motifs 1, 7, and 8, in which the highly conserved residues constitute the catalytic loop. In addition, motifs 4, 5, and 7 contain the conserved glycine (G) or glutamic acid (E) residues required for ATP binding, and motifs 2, 5, 7, and 8 contain conserved glycine (G) or proline (P) residues required for activity (Figure S1). Motifs 2, 3, 4, 5, 7, and 8 are conserved in the Lmr subfamily, while only motifs 3 and 8 are conserved in the Axl subfamily. In the JakB (Janus kinase B) subfamily, each member has the same motifs, with some members having additional motifs, and the motif architecture differs among these proteins. These results reveal that the phylogenetic position of EeTKs is consistent with the conservation of core motifs, while the structures of gene members were much more complicated.
Figure 1.

Phylogeny and motif architecture of electric eel TK genes. The left panel shows a maximum likelihood (ML) tree with 1000 bootstraps rooted by sea squirt atk gene. For simplicity, the bootstrap support percentages are plotted as circle marks on the branch (only higher than 50% are indicated), and circle size is proportional to the bootstrap values. The right panel shows the motif architecture for each EeTK gene identified using the MEME suite according to the protein sequence. EeTK indicates electric eel TK; MEME, Multiple Expectation-Maximization for Motif Elicitation; ML, maximum likelihood; TK, tyrosine kinases.


Functional conservation and diversification of EeTKs

To better understand the functional diversity of EeTKs, we conducted a domain architecture survey for TK-encoding genes (Figure S2). Of the 114 putative TK-encoding proteins identified, 70 RTKs contain an intracellular KD, a transmembrane (TM) segment, a signal peptide, and known protein domains or motifs in the extracellular region. The other 44 EeTKs lack a signal peptide and a TM and are thus classified as CTKs (Figure 2). The CTK repertoire shared the domain architecture with SH2 or SH3, which can mediate interprotein interactions and promote n-Src catalytic activity, and thus influence signal transduction.[38] The Jak subfamily has dual TK domains and grouped into 2 clusters, designated JakA and JakB; this result was consistent with the report that 1 domain is inactive and may regulate the catalytic activity and autophosphorylation of the other domain, which is active.[39] RTKs show an extensive divergence in their architectures, containing 19 subfamilies with distinct organizations of protein domains. The most common domains are fibronectin type III (FN3) and immunoglobulin (IG) domains, which are extracellular domains involved in interactions with extracellular ligands or other receptors.[40] Members of the Eph, InsR, Axl, and Tie subfamilies typically have a single KD and 2 or 3 FN3 repeats. The PDGFR, VEGFR, and FGFR subfamilies are very similar in structure, with 5 or 7 IG domains characterizing the extracellular portion of the proteins.[38] In addition, the Tie, Axl, Ror, Musk, Trk, and CCK4 subfamilies also possess an IG domain.
Figure 2.

Classification of EeTKs. One hundred fourteen EeTKs were divided into 10 cytoplasmic tyrosine kinase (CTK) and 20 receptor tyrosine kinase (RTK) subfamilies, respectively, by domain architecture. A typical domain organization is schematically displayed for each subfamily, with the number of the genes that belong to each domain architecture shown in parentheses. The Pfam or SMART domain names are shown on the bottom. The domain organizations of all the proteins are shown in Figure S1. CTK indicates cytoplasmic tyrosine kinase; EeTK, electric eel TK; RTK, receptor tyrosine kinase.

“General” kinase genes are associated with many diverse biological processes, suggesting that different subfamilies of kinases may be involved in distinct functions. To study the functions of EeTKs, gene ontology (GO) annotations were obtained from the Ensembl genome databases to construct GO graphs. As shown in Figure S3, almost all of the EeTKs (97.5% for CTKs, 100% for RTKs) are involved in catalytic activity, binding, cellular process, and metabolic process, consistent with their primary functions. The results also showed that the majority of CTKs are involved in diverse biological regulatory functions (52.5%) and predominantly participate in responses to stimuli (50%), while RTKs tend to be related to membrane part (64.3%), molecular transducer activity (71.4%), and biological regulation (75.7%). Compared with the wide range of genes with GO annotations, a high proportion of the EeTK genes play roles in catalytic activity, binding, signal transducer activity, metabolic process, localization, signaling, and so on.

Whole-genome and tandem duplications contribute to EeTKs diversity

Gene duplications in genomes could provide important information for gene evolution analysis.[41,42] Ray-finned fishes are believed to have undergone a third genome duplication, the role of which in specific fish genomes has yet to be understood.[43] Thus, we assigned orthology to the EeTKs based on their zebrafish counterparts (Figure 3), and a list of electric eel TKs, their orthologues, and percent identities is shown in Table S2. The zebrafish TKs were identified by Challa and Chatti,[10] with the results highlighting the effects of zebrafish genome duplication events on TK evolution in teleosts. The orthology assignments show a clear clustering of EeTKs and zebrafish TKs into identical families, indicating that the duplication of EeTKs in E electricus presumably resulted from a teleost-specific genome-wide duplication event (ie, ploidy).[44,45] The results revealed that each EeTK has a corresponding zebrafish orthologue, except for bmx. In addition, the zebrafish genes ptk2ab, jak1, jak2b, src, yes1, lyn, and erbb4 have lineage-specific duplicated copies in E electricus, whereas met, epha10, fgfr1b, ptk2aa, ptk6a, axlb, and tnk2a do not have electric eel orthologues. Therefore, the TK composition of the electric eel, as a bony fish, is similar to that of zebrafish but also has its own specific features. We further investigated the number of EeTK orthologues present in the human genome, which encodes 90 TKs, with 32 CTKs and 58 RTKs.[23] Fifty human TKs have single electric eel orthologues, whereas of the 30 human TKs with multiple electric eel orthologues, 9 are CTKs and 21 are RTKs. The 9 human CTKs are represented by 19 electric eel orthologues, whereas the 21 human RTKs are represented by 44 electric eel orthologues (Figure S4). Such an expansion of TKs is in accordance with genome duplication events that occurred during the teleost radiation. 
Despite the difference in the actual number of genes, we observed that every human TK subfamily is represented in the teleost, suggesting conserved TK evolution in vertebrates. In addition, based on genome localization and MCScanX analysis results, we found that 7 of the 114 EeTKs were tandemly duplicated genes, namely, erbb4, erbb4c, kdr, pdgfra, pdgfrb, kita, and csf1ra (Table S3, Figure S5), which are clustered into 3 tandem duplication regions on the E electricus genome scaffolds. Furthermore, 3 pairs of segmentally duplicated genes were detected in our analysis, namely, erbb4 and erbb4c, pdgfra and pdgfrb, and kita and kdr. The Erbb kinases have previously been reported as electric organ-specific proteins[46]; here, we also identified the other 2 sets of closely arranged tandem EeTK genes, which provide potential target genes for unique tissue-function specialization in the electric eel. The diversified and specified gene functions in each gene family may be the result of gene expansion from ancient paralogs or multiple origins of gene ancestry. Our results demonstrate that both whole-genome and tandem duplication events have been fundamental in shaping the diversity of the TK repertoire in the E electricus genome.
Figure 3.

Dendrogram representing orthologous relationships between electric eel and zebrafish TK proteins. Phylogenetic analysis was conducted using the maximum likelihood (ML) method based on an alignment of TK domains from the electric eel and zebrafish, with a TK gene of Ciona intestinalis used as the outgroup. Only the subfamily names are shown. The electric eel genes are prefixed with “Ee,” and the zebrafish genes are prefixed with “Dr.” ML indicates maximum likelihood; TK, tyrosine kinases.


Lineage-specific expansion but conservation of TKs in vertebrates

To better understand the phylogenetic relationships in this multigene family during vertebrate evolution, we used the NJ and ML algorithms to analyze EeTKs using sequences from 5 other representative chordates, including amphioxus (Branchiostoma lanceolatum), sea squirt (Ciona intestinalis), sea lamprey (Petromyzon marinus), elephant shark (Callorhinchus milii), and zebrafish (Danio rerio) (Figure 4). To minimize phylogenetic artifacts, we inferred a simplified ML tree that excluded sequences with identical structural domain compositions in the same species, which were considered to be recent closely related paralogs. The resulting tree topology generally agreed with the original tree that used the complete data set (Figure 1). Due to genome duplication during the teleost radiation, all the EeTK genes clustered with duplicated paralogous genes from vertebrates to form monophyletic clades. All the TKs were analyzed in the amphioxus and elephant shark genomes, and at least 1 TK member was identified among all of the 30 major subfamilies. This finding strengthened the notion that the basic repertoire of the CTK and RTK families had already been established before the divergence of urochordates and vertebrates. After the divergence of protostomes and deuterostomes, the multiplicity of members in the same subtype rapidly increased by further gene duplication during the first half of chordate evolution before the fish-tetrapod split, giving rise to gene family expansion. Phylogenetic analyses revealed that the TK repertoire expanded and was maintained in at least 1 major group of teleosts. We found that gene members of the Jak, Src, and EGFR subfamilies expanded specifically in the electric eel, whereas Eph, FGFR, and Axl have more members in zebrafish. Teleosts have a considerably larger repertoire of TKs than tetrapods, which is in accordance with most recent large-scale changes being limited to simple duplications, including whole-genome duplications within vertebrates. 
Our results support a conserved but expanded pattern for the TK repertoire that is markedly consistent between the electric eel and other species.
Figure 4.

Phylogenetic relationships of major tyrosine kinase subfamilies in 5 representative chordates. The ML tree was based on multiple sequence alignments of TK domains, with bootstrap values shown on the branches. Abbreviated species names are as follows: Bf, Branchiostoma lanceolatum; Ci, Ciona intestinalis; Pm, Petromyzon marinus; Cm, Callorhinchus milii; Ee, Electrophorus electricus; and Dr, Danio rerio. The RTK subfamily is marked in blue, and the CTK subfamily is marked in red. CTK indicates cytoplasmic tyrosine kinase; ML, maximum likelihood; RTK, receptor tyrosine kinase; TK, tyrosine kinases.


Diversity and commonality of metazoan TK repertoires

Both the timing and the rapid expansion in the number of TK genes provide a glimpse into the mechanisms involved in promoting the coordinated emergence and increasing sophistication of signal transduction during eukaryotic evolution.[47] We further traced the origins and diversification of TK subfamilies in 31 genomes covering the major clades of metazoans. As shown in Figure 5, all of the TK members were placed into corresponding clusters, including 1674 TK genes among the 31 subfamilies (Table S4). Among the bilaterians, the number of subfamilies ranges from 13 to 30. The protostomes possess 13~23 subfamilies, and the deuterostomes possess 21~30 subfamilies. A similar number of represented subfamilies (ranging from 26 to 30) are present across the vertebrates, which indicates that the TK repertoire tends to be evolutionarily conserved. In addition, the total number of TKs across different lineages exhibited considerable variation (ranging from 32 in fruit fly to 110 in electric eel), indicating that clade-specific gene duplication and domain shuffling can increase the number of metazoan TK subfamilies with a premetazoan origin. We observed that 11 of the 30 TK subfamilies generally possess a single representative gene among protostomes, and a few subfamilies, including VEGFR, Met, and InsR, have undergone species-specific expansions. The Src subfamily was established before the emergence of protostomes, and this subfamily contains some members (src, yes1, fyn, and lyn) that expanded in the electric eel and may be associated with the electric organs reported in previous studies.[48,49] For the Tec, Tie, Ddr, and Eph subfamilies, lineage-specific duplications were detected in the nonvertebrate deuterostome genomes (in the sea squirt and/or the amphioxus). Subfamily genes ddr2l and Kdrl, which have distinct orthologues in the chicken and stickleback genomes, were not maintained in mammals and eutherians. 
The RTK genes epha1 and insrr and the CTK genes fgr and srm are present in mammals but not in the electric eel. These results suggest that some TKs may be more dispensable than others and may have been lost independently in different lineages. For the electric eel, the exceptional and independent diversification of the TK repertoire through the creation of TKs with diverse architectures may be associated with an increase in organismal diversity and tissue-function specializations.
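The contrast drawn above, between a stable number of represented subfamilies and a variable total gene count, can be illustrated with a minimal Python sketch. The species names and per-subfamily counts below are hypothetical toy values, not data from Table S4, and the helper `summarize` is introduced here only for illustration.

```python
# Hypothetical toy counts of genes per TK subfamily in two species.
# These numbers are illustrative only and are NOT taken from Table S4.
tk_counts = {
    "species_A": {"Src": 4, "Eph": 6, "Jak": 2, "Met": 0},
    "species_B": {"Src": 1, "Eph": 1, "Jak": 0, "Met": 1},
}

def summarize(counts):
    """Return (number of subfamilies represented, total TK genes) for one species."""
    represented = sum(1 for n in counts.values() if n > 0)
    total = sum(counts.values())
    return represented, total

summary = {species: summarize(counts) for species, counts in tk_counts.items()}

for species, (represented, total) in summary.items():
    print(f"{species}: {represented} subfamilies represented, {total} TK genes")
```

In this toy case both species have the same number of represented subfamilies while their total gene counts differ fourfold, which is the kind of pattern the text describes for vertebrates.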
Figure 5.

Schematic representation of the occurrence of TKs in 31 representative metazoan species. Each colored square indicates the number of genes (color key at the bottom) observed in a species. The subfamily topology on the left shows the ML tree according to Figure S1. CTK indicates cytoplasmic tyrosine kinase; ML, maximum likelihood; RTK, receptor tyrosine kinase; TK, tyrosine kinases.

Comparative gene expression profiling of TKs across different tissues

The expression patterns of genes are typically related to their functions, and RNA-Seq experiments have provided preliminary information on EeTK gene expression profiles. We therefore investigated the expression patterns of EeTK genes in different tissues via RNA-Seq analysis and retrieved expression data for the corresponding human tissues from GTEx. Our results showed high variance in the expression levels of human and electric eel TK genes among different tissues (Figure 6). In the available RNA-Seq data for the electric eel, the brain showed the highest number of expressed TK genes (87 genes) among the tested tissues, while the kidney and electric organs presented the fewest (only 17, 28, and 25 genes for the kidney, primary electric organ [EO], and Sachs EO, respectively; Figure 6A). In humans, the lung presented the highest number of expressed TK genes (78 genes), while the skeletal muscle showed the fewest (30 genes). Only a few genes from humans and electric eels, including ddr1, txk, epha4b, ddr2b, fgfr4, axl, and fyna, showed no expression in a few or even all tissues, whereas most genes, especially fynb, syk, hck, and pdgfrb, exhibited relatively high expression in all tissues (Figure 6A). These similar expression patterns may indicate that these genes play fundamental roles in regulating organism growth and tissue development. We also found a few genes with distinct expression patterns. For instance, epha8 showed low expression across almost all the tested tissues in the electric eel but high expression in human tissues. Ptk2bb was expressed at low levels in all of the electric eel tissues except the brain, whereas this gene was highly expressed in all human tissues except muscle. 
In addition, we observed that electric organ tissues and muscle shared a similar pattern with specific highly expressed TKs (ie, epha7, musk, jak1, and pdgfra), suggesting that regulation of TKs might play an important role in specifying an electric organ identity from its muscle precursor. Epha4 and epha6 were predominantly highly expressed in the electric organ and expressed at lower levels in muscle, whereas other Epha members were expressed at low levels in both the electric organ and muscle. These results indicate that members of TK subfamilies have undergone subfunctionalization, representing an evolutionary divergence required for the performance of electric organs and muscle.
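The per-tissue counts of expressed TK genes and the log2 scaling used for Figure 6A can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the pseudocount of 0.1 comes from the Figure 6 legend, but the FPKM cutoff for calling a gene "expressed" is an assumption (the text does not state one), and the gene names and values are hypothetical.

```python
import math

PSEUDOCOUNT = 0.1      # pseudocount stated in the Figure 6 legend
EXPRESSED_FPKM = 1.0   # ASSUMED cutoff; the source does not state one

# Hypothetical FPKM values for three placeholder genes in three tissues.
fpkm = {
    "gene_a": {"brain": 12.0, "kidney": 0.0, "muscle": 3.5},
    "gene_b": {"brain": 0.4, "kidney": 0.0, "muscle": 20.0},
    "gene_c": {"brain": 2.0, "kidney": 1.5, "muscle": 0.2},
}

def log2_fpkm(value):
    """log2-transform an FPKM value after adding the pseudocount."""
    return math.log2(value + PSEUDOCOUNT)

def expressed_per_tissue(table, cutoff=EXPRESSED_FPKM):
    """Count, for each tissue, the genes whose FPKM reaches the cutoff."""
    counts = {}
    for tissues in table.values():
        for tissue, value in tissues.items():
            counts[tissue] = counts.get(tissue, 0) + (value >= cutoff)
    return counts

counts = expressed_per_tissue(fpkm)
print(counts)
```

The pseudocount keeps unexpressed genes (FPKM of 0) on the log scale instead of producing an undefined value; the number of genes counted as expressed depends directly on the chosen cutoff.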
Figure 6.

The expression profiles of EeTK genes in different tissues. (A) The relative expression levels are shown corresponding to log2-transformed FPKM values after adding a pseudocount of 0.1. The scaled colors vary from blue showing high expression level to red showing low expression for genes according to FPKM values. (B) The comparison of tau values of TK genes between the electric eel and humans. EeTK indicates electric eel TK; FPKM, fragments per kilobase transcript length per million fragments mapped; TK, tyrosine kinases.

Since the τ parameter has recently been proposed as a standard measure of expression specificity,[33] we used this value to determine whether a given gene is expressed in a tissue-specific manner. We refer to genes with τ values greater than 0.8 as exhibiting tissue-specific expression, such as ephb8, alk, and fgfrl1b (Figure 6B); these were predominantly expressed at high levels in the electric eel brain and spinal tissues. Most genes (~71%) with τ values less than 0.4 were regarded as broadly expressed genes, including rbmx and jak2b (Figure 6B). Mst1rb, ntrk3b and ephb3a were expressed at low and similar levels in the electric organ. Epha8, alk, and musk exhibited a similar pattern between humans and electric eels. Tissue-specific expression patterns may reflect unique tissue-function specializations and provide insight into the evolutionary conservation and divergence of duplicated gene expression, maintenance, and regulation.
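As a rough illustration of the τ-based classification, the sketch below uses the commonly used definition of the index, τ = Σ(1 − x_i/x_max)/(n − 1) over n tissues, together with the 0.8 and 0.4 thresholds given in the text. Whether the authors computed τ on raw or log-transformed values is not stated, so the input scale is an assumption, and the function names and expression values are hypothetical.

```python
def tau(values):
    """Tissue-specificity index: 0 = uniform expression, 1 = a single tissue only."""
    n = len(values)
    x_max = max(values)
    if n < 2 or x_max <= 0:
        return None  # undefined for a single tissue or an unexpressed gene
    return sum(1 - x / x_max for x in values) / (n - 1)

def classify(t, specific=0.8, broad=0.4):
    """Apply the thresholds from the text (tau > 0.8 specific, tau < 0.4 broad)."""
    if t is None:
        return "not expressed"
    if t > specific:
        return "tissue-specific"
    if t < broad:
        return "broadly expressed"
    return "intermediate"

# Hypothetical expression profiles across four tissues.
profiles = {
    "one_tissue_only": [10.0, 0.0, 0.0, 0.0],
    "uniform": [5.0, 5.0, 5.0, 5.0],
    "mixed": [10.0, 5.0, 5.0],
}
for name, values in profiles.items():
    print(name, tau(values), classify(tau(values)))
```

Genes falling between the two thresholds are labeled "intermediate" here; that label is a convenience of this sketch rather than a category named in the text.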

Discussion

This study was conducted to create an unambiguous resource providing detailed information on EeTKs and their relationship to multiple metazoan TKs. The information we report is crucial for gaining a better understanding of the TK repertoire in an emerging model electric fish and for studies of TK biology in various model organisms. To our knowledge, no such study has previously been reported, and because the annotation of the electric eel genome is ongoing (Figure S6), our study generated systematic and useful annotation data for the EeTK genes.

Evolutionary relationships within the vertebrate TK repertoire

When the EeTK genes are compared with the TKs of human and zebrafish, the majority of their features, including subfamily numbers and sequence characteristics, appear to be largely stable.[50,51] Most metazoan genomes encode 30 TK subfamilies, although the total number of TKs representing each subfamily of orthologues varies among different lineages (Figure 4, 90 in humans, 122 in zebrafish and 114 in the electric eel). Because the conservation of gene family size across multiple species may reflect specific functional constraints,[52,53] this observation is consistent with our TK identification results in the electric eel. The difference in the total number of TKs is due to the specific expansion of a few subfamilies, such as the Eph and Src subfamilies. Similarly, among the bilaterians, the differences in the total number of TK genes may also result from the expansion of specific subfamilies. In the present study, our genome-wide analysis of E electricus identified 114 TK-encoding genes belonging to 30 subfamilies, and sequence identity matrix analysis results suggested that TKs in vertebrates tend to be remarkably conserved and stable.[37]

Sequence architecture reveals functional conservation and diversification

TKs have significant roles in cell growth, apoptosis, and development. A simple way to assess functional use is to assess the extent to which a consistent order of domain architectures is maintained.[54] A comparison of the number of distinct domain combinations may more accurately reflect the diversity of functional usage of protein families. The evolutionary origin of the canonical/functional domain organization of each TK subfamily is indicated where a conserved function has been demonstrated. A highly conserved subfamily of TKs may be responsible for highly conserved signaling pathways regulating target gene expression, and the diversity of domain architecture within a subfamily (such as Eph and Src) can indicate the linkage of historically disparate functional domains into a novel molecule during the evolution of complex multicellular life.[55] Of the 10 common metazoan CTK subfamilies, at least 4 (Src, Csk, Tec, and Abl) are designated as Src-related CTKs for their shared SH3-SH2-TyrKc domain architecture, whereas in the RTK subfamilies, almost all the subfamilies evolved extracellular domains, which are probably used for sensing extracellular signals. The distantly related domain patterns of RTKs and CTKs reflect that the generation of these 2 types of TKs may have been due to different evolutionary conservation, environmental variability, and biological functions during the unicellular-multicellular transition.[56] The lineage-specific TK repertoire, which diversified independently of the common set of TKs in metazoans, may reflect the constant exposure of their tissues to the environment.[57] The more stable evolution of CTKs would then reflect a relatively stable intracellular environment, with CTKs generally acting downstream of RTKs or other receptors to transmit extracellular signals. The far fewer changes in domain architecture in more evolutionarily recent times may coincide with a more stable multicellular environment.

Distinct evolution of the vertebrate CTK and RTK repertoire after gene duplication events

Phylogenetic analysis suggested that expansion and sequence variation events have occurred in the TK family in vertebrates. The TK repertoire appears to have been largely stable after the initial expansion, with a unique set of vertebrate TKs retained after whole-genome duplication.[58] The duplications of genes and entire genomes are believed to be important mechanisms underlying morphological variation and functional innovation in the evolution of life, especially for the wide diversity observed in the speciation of fishes.[44,59] The results of our analysis further show that gene duplication, which is known to be common in teleosts owing to the whole-genome duplications in the teleost lineage, occurs more often in RTK genes than in CTK genes (RTK vs CTK for multiple orthologs is 21 vs 9 and 44 vs 19 in human and electric eel, respectively; Figure S4). This finding is consistent with the theory that, following genome duplication, gene duplicates that acquire novel functions and contribute to diversity are retained more frequently as contributors to the evolution of organisms. RTKs are longer and contain more domains with greater variation than CTKs, suggesting a greater probability of evolving novel combinations and functions, which results in greater duplicate gene retention for RTKs. This explanation is also consistent with the recent discovery of a higher divergence among RTKs during metazoan evolution, which may have facilitated cell-to-cell communication and allowed for responses to a variety of extracellular cues during the evolution of multicellularity.[8] The same rationale would help explain the retention of a larger RTK than CTK repertoire in the electric eel. Our assessment of these features in the electric eel is consistent with observations in zebrafish, which helps to explain the retention of a larger RTK than CTK repertoire in vertebrate lineages.
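The RTK versus CTK comparison quoted above (21 vs 9 in human, 44 vs 19 in electric eel) can be restated as proportions with a few lines of Python. The counts come from the text; the helper `rtk_share` and the framing as a share of all multi-ortholog genes are illustrative choices made here, not an analysis reported by the authors.

```python
# Counts of TK genes with multiple orthologs, as quoted in the text (Figure S4).
multi_ortholog_counts = {
    "human": {"RTK": 21, "CTK": 9},
    "electric eel": {"RTK": 44, "CTK": 19},
}

def rtk_share(counts):
    """Fraction of multi-ortholog TK genes that are RTKs."""
    return counts["RTK"] / (counts["RTK"] + counts["CTK"])

shares = {species: rtk_share(c) for species, c in multi_ortholog_counts.items()}

for species, share in shares.items():
    print(f"{species}: RTK share = {share:.3f}")
```

Under this framing the RTK share is close to 0.70 in both species, even though the absolute counts roughly double in the electric eel.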

Expression pattern of TK genes in the electric eel

Expression profiling analyses revealed tissue-specific or sex-dimorphic expression patterns of EeTK genes. Our findings complement the systematic information available on the TK family in the electric eel and increase our understanding of metazoan TKs. The results showed that 90 TKs were expressed in at least 1 tested tissue, and there was high variance in the expression levels among different tissues. The expression pattern analysis of EeTK genes among 7 different tissues showed that 9 EeTK genes exhibited tissue-specific expression, while 12 EeTK genes were broadly expressed. The former group includes alk and ephb2 (Figure 6A), while the latter group includes ryk and jak1 (Figure 6B). Our results suggest that the variable expression patterns of genes exhibiting tissue-specific expression may indicate that they have different roles between species, and the similar patterns of broadly expressed genes may suggest that they have fundamental roles in organism development.[16,60] We observed that the tyk1 gene is highly expressed in the muscle and EO, which may indicate that this TK is involved in bioelectricity generation. It has been reported that when this gene is overexpressed, phosphatidylinositol 3’-kinase (PI3K) pathways can be activated to induce cell invasion and metastasis to distant organs.[61] PI3K acts through distinct signaling targets to regulate cell size, cell proliferation, and protein synthesis and degradation. These results are consistent with the previous finding that electrocytes, the electrical cells in EOs, are much larger than muscle fibers, which may be due to changes in insulin-like growth factor (IGF) signaling pathway genes.[8] Overall, the above results show variable expression of major TKs in different tissues and may indicate that each gene plays different roles during the organogenesis process. 
Considering that the electric eel has electric organs, which makes it unusual among teleost fish, and that TKs are one of the components of the P-Tyr cell signaling pathway in metazoans, we considered that TKs may be related to the specialization of electric organ discharge. Although the functions of most TK genes in the electric eel remain to be examined, our phylogenetic and expression analyses provide a solid foundation for future research.[10,62] Follow-up functional studies are required for a better understanding of the roles of TKs in the regulation of key growth and developmental processes.

Conclusions

In this study, we systematically identified the putative full complement of TK genes by analyzing the E electricus genome sequences, characterized their sequences by phylogenetic analysis among representative metazoans, and analyzed their expression profiles in different tissues. Understanding the evolution and regulation of TK activity across vertebrates and their relationships to the evolution of multicellular organisms has been a crucial subject in recent studies. It should be noted that the currently annotated version of the E electricus genome sequence contains unmapped and partial scaffolds. As the genome project continues, the quality of the sequence information will improve, and the gene annotations will be more informative. While we believe that our study includes all of the EeTKs, new and refined sequence information may result in modifications to our computational findings with functionally relevant annotations.

Supplemental material: Supplementary figures and Tables S1 to S4 for Genomic Survey of Tyrosine Kinases Repertoire in Electrophorus electricus With an Emphasis on Evolutionary Conservation and Diversification by Ling Li, Dangyun Liu, Ake Liu, Jingquan Li, Hui Wang and Jingqi Zhou in Evolutionary Bioinformatics.

  61 in total

Review 1.  The protein kinase complement of the human genome.

Authors:  G Manning; D B Whyte; R Martinez; T Hunter; S Sudarsanam
Journal:  Science       Date:  2002-12-06       Impact factor: 47.728

2.  The "fish-specific" Hox cluster duplication is coincident with the origin of teleosts.

Authors:  Karen D Crow; Peter F Stadler; Vincent J Lynch; Chris Amemiya; Günter P Wagner
Journal:  Mol Biol Evol       Date:  2005-09-14       Impact factor: 16.240

3.  Effective estimation of the minimum number of amino acid residues required for functional divergence between duplicate genes.

Authors:  Jingqi Zhou; Dangyun Liu; Zhining Sa; Wei Huang; Yangyun Zou; Xun Gu
Journal:  Mol Phylogenet Evol       Date:  2017-05-12       Impact factor: 4.286

Review 4.  Whole-genome duplication in teleost fishes and its evolutionary consequences.

Authors:  Stella M K Glasauer; Stephan C F Neuhauss
Journal:  Mol Genet Genomics       Date:  2014-08-05       Impact factor: 3.291

Review 5.  Cell signaling by receptor tyrosine kinases.

Authors:  Mark A Lemmon; Joseph Schlessinger
Journal:  Cell       Date:  2010-06-25       Impact factor: 41.582

6.  Signaling properties of a non-metazoan Src kinase and the evolutionary history of Src negative regulation.

Authors:  Wanqing Li; Susan L Young; Nicole King; W Todd Miller
Journal:  J Biol Chem       Date:  2008-04-04       Impact factor: 5.157

7.  Unique patterns of transcript and miRNA expression in the South American strong voltage electric eel (Electrophorus electricus).

Authors:  Lindsay L Traeger; Jeremy D Volkening; Howell Moffett; Jason R Gallant; Po-Hao Chen; Carl D Novina; George N Phillips; Rene Anand; Gregg B Wells; Matthew Pinch; Robert Güth; Graciela A Unguez; James S Albert; Harold Zakon; Michael R Sussman; Manoj P Samanta
Journal:  BMC Genomics       Date:  2015-03-26       Impact factor: 3.969

8.  Genome-Wide Search for Tyrosine Phosphatases in the Human Genome Through Computational Approaches Leads to the Discovery of Few New Domain Architectures.

Authors:  Teerna Bhattacharyya; Ramanathan Sowdhamini
Journal:  Evol Bioinform Online       Date:  2019-04-09       Impact factor: 1.625

9.  The Physarum polycephalum Genome Reveals Extensive Use of Prokaryotic Two-Component and Metazoan-Type Tyrosine Kinase Signaling.

Authors:  Pauline Schaap; Israel Barrantes; Pat Minx; Narie Sasaki; Roger W Anderson; Marianne Bénard; Kyle K Biggar; Nicolas E Buchler; Ralf Bundschuh; Xiao Chen; Catrina Fronick; Lucinda Fulton; Georg Golderer; Niels Jahn; Volker Knoop; Laura F Landweber; Chrystelle Maric; Dennis Miller; Angelika A Noegel; Rob Peace; Gérard Pierron; Taeko Sasaki; Mareike Schallenberg-Rüdinger; Michael Schleicher; Reema Singh; Thomas Spaller; Kenneth B Storey; Takamasa Suzuki; Chad Tomlinson; John J Tyson; Wesley C Warren; Ernst R Werner; Gabriele Werner-Felmayer; Richard K Wilson; Thomas Winckler; Jonatha M Gott; Gernot Glöckner; Wolfgang Marwan
Journal:  Genome Biol Evol       Date:  2015-11-27       Impact factor: 3.416

10.  Whole Genome Duplications Shaped the Receptor Tyrosine Kinase Repertoire of Jawed Vertebrates.

Authors:  Frédéric G Brunet; Jean-Nicolas Volff; Manfred Schartl
Journal:  Genome Biol Evol       Date:  2016-06-03       Impact factor: 3.416
