
A genome-wide survey on basic helix-loop-helix transcription factors in giant panda.

Chunwang Dang, Yong Wang, Debao Zhang, Qin Yao, Keping Chen.

Abstract

The giant panda (Ailuropoda melanoleuca) is a critically endangered mammalian species. Studying the functions of regulatory proteins involved in developmental processes would help explain the specific behavior of the giant panda. The basic helix-loop-helix (bHLH) proteins play essential roles in a wide range of developmental processes in higher organisms, and bHLH family members have been identified in over 20 organisms, including fruit fly, zebrafish, mouse and human. The present study identified 107 bHLH family members encoded in the giant panda genome. Phylogenetic analyses revealed that they belong to 44 bHLH families, with 46, 25, 15, 4, 11 and 3 members in groups A, B, C, D, E and F, respectively, while the remaining 3 members were assigned as "orphan". Compared with mouse, seven bHLH proteins were not found in the giant panda genome, namely Beta3a, Mesp2, Sclerax, S-Myc, Hes5 (or Hes6), EBF4 and Orphan 1. These results provide useful background information for future studies on the structure and function of bHLH proteins in the regulation of giant panda development.

Year:  2011        PMID: 22096504      PMCID: PMC3212526          DOI: 10.1371/journal.pone.0026878

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The basic helix-loop-helix (bHLH) proteins form a large superfamily of transcription factors that play crucial roles in a wide range of developmental processes, including neurogenesis, myogenesis, hematopoiesis, sex determination and gut development. The bHLH domain is approximately 60 amino acids long and comprises a DNA-binding basic region (b) and two helices separated by a variable loop region (HLH) [1]. The HLH domain promotes dimerization, allowing the formation of homodimeric or heterodimeric complexes between different bHLH family members. The two basic domains brought together through dimerization bind specific hexanucleotide sequences. In the past two decades, the functions of animal bHLH family members have been well characterized, mainly through studies of bHLH proteins in model organisms including the nematode (Caenorhabditis elegans), fruit fly (Drosophila melanogaster) and mouse (Mus musculus). Animal bHLHs are classified into 45 families based on their different functions in the regulation of gene expression. In addition, they are divided into 6 groups according to the target DNA elements they bind and their own structural characteristics. Group A consists of 22 families, which mainly regulate neurogenesis, myogenesis and mesoderm formation. Group B consists of 12 families, which mainly regulate cell proliferation and differentiation, sterol metabolism and adipocyte formation, and the expression of glucose-responsive genes. Group C has 7 families, which are responsible for the regulation of midline and tracheal development and circadian rhythms, and for the activation of gene transcription in response to environmental toxins. Group D has only 1 family, which forms inactive heterodimers with group A bHLH proteins. Group E has 2 families, which regulate processes such as embryonic segmentation, somitogenesis and organogenesis. Group F also has 1 family, which regulates processes such as head development and the formation of olfactory sensory neurons (reviewed in [2]).

With the completion of genome sequencing projects for an increasing number of organisms, bHLH family members have been identified in the genomes of over 20 organisms. These include 8 bHLH genes in Saccharomyces cerevisiae, 16 in Amphimedon queenslandica, 33 in Hydra magnipapillata, 42 in Caenorhabditis elegans, 46 in Ciona intestinalis, 50 in Strongylocentrotus purpuratus, 51 in Apis mellifera, 52 in Bombyx mori, 57 in Daphnia pulex, 59 in Drosophila melanogaster, 63 in Lottia gigantea, 64 in Capitella sp 1, 68 in Nematostella vectensis, 78 in Branchiostoma floridae, 87 in Tetraodon nigroviridis, 104 in Gallus gallus, 114 in Mus musculus, 114 in Rattus norvegicus, 118 in Homo sapiens, 139 in Brachydanio rerio, 147 in Arabidopsis, and 167 in Oryza sativa [3]–[12].

The giant panda, Ailuropoda melanoleuca, is a critically endangered mammal confined to six isolated mountain ranges of south-western China [13]. As one of the most primitive carnivores, the giant panda not only has a unique feeding habit but also has highly specialized reproductive behavior and low fertility [14], all of which suggest that it has considerably different regulatory mechanisms in growth and development. However, very little is known about the structure and function of regulatory genes in the growth and development of the giant panda [15], [16]. Because bHLH proteins are of great importance in the regulation of organismal development, in this study we made an exhaustive effort to obtain the complete list of bHLH family members encoded in the giant panda genome. As a result, 107 bHLH family members were identified. Phylogenetic analyses with their mouse bHLH homologues revealed that the 107 giant panda bHLH members belong to 44 bHLH families, with 46, 25, 15, 4, 11 and 3 members in groups A, B, C, D, E and F, respectively, while 3 members were assigned as “orphan”. The present study provides useful background information for future studies on the structure and function of bHLH proteins in the regulation of giant panda development.

Materials and Methods

Blast Searches

The sets of 45 representative bHLH domains and 114 mouse bHLH motifs were taken from the additional files of previous reports [4], [17], respectively. Each sequence in both sets was used as a query in tblastn searches against the giant panda genome sequences, which were accessed through the hyperlink provided on GenBank's MapView webpage (http://www.ncbi.nlm.nih.gov/mapview/). The expect value (E) was set at 10 in order to obtain all bHLH-related sequences. The resulting subject sequences were manually examined in three ways: only one sequence was kept among those sharing the same contig number, reading frame and coding regions; missing amino acids were added at the corresponding sites with the help of the EditSeq program (version 5.01) of the DNAStar package; and introns within the bHLH motifs were located using the online NetGene2 application (http://www.cbs.dtu.dk/services/NetGene2/). Accession numbers of giant panda bHLH proteins were obtained by using the amino acid sequence of each identified bHLH motif in blastp searches against the giant panda protein sequence databases, which were also accessed through the hyperlink on GenBank's MapView webpage.
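The redundancy-removal step described above (keeping one sequence per contig, reading frame and coding region) was done manually in the study. As an illustration only, the same rule can be sketched in Python. The `Hit` record and the query labels are hypothetical; the coordinates are those listed for GpAsh1 and GpAsh2 in Table 2.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One tblastn subject hit (hypothetical record layout)."""
    contig: str  # contig accession
    frame: int   # reading frame, -3..-1 or +1..+3
    start: int   # first base of the coding region
    end: int     # last base of the coding region
    query: str   # label of the query that produced the hit (illustrative)


def deduplicate(hits):
    """Keep the first hit seen for each (contig, frame, start, end) key."""
    unique = {}
    for hit in hits:
        key = (hit.contig, hit.frame, hit.start, hit.end)
        unique.setdefault(key, hit)
    return list(unique.values())


hits = [
    Hit("NW_003217384.1", 1, 937516, 937674, "Mash1"),
    Hit("NW_003217384.1", 1, 937516, 937674, "ASCa representative"),
    Hit("NW_003217681.1", -2, 1024199, 1024041, "Mash2"),
]
unique_hits = deduplicate(hits)
```

Here two queries that hit the same locus collapse into one record, so `unique_hits` holds two entries. Real searches would also need a rule for partially overlapping hits, which the text does not specify.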

Sequence Alignment

All sequences improved by the above methods were aligned using the ClustalW program embedded in MEGA4 [18] with default settings. Each sequence was manually checked for its amino acid residues at the 19 conserved sites [19]. Sequences with fewer than nine variations were regarded as potential giant panda bHLH members. Sequences with fewer than ten conserved amino acids were discarded, and the remaining sequences were aligned again using ClustalW. The aligned giant panda bHLH motifs were shaded in GeneDoc Multiple Sequence Alignment Editor and Shading Utility (Version 2.6.02) [20] and copied to a rich text file for further annotation.
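The conserved-site screen was performed by manual checking. The sketch below only illustrates the counting rule ("fewer than nine variations" at the conserved sites). The 19 sites and their accepted residues come from reference [19] and are not reproduced here, so `toy_sites` is an invented four-site stand-in and the function names are hypothetical.

```python
def count_variations(motif, conserved_sites):
    """Count conserved sites at which the motif carries a non-accepted residue.

    conserved_sites maps a 0-based alignment column to the set of residues
    accepted at that column. Columns beyond the motif count as variations.
    """
    variations = 0
    for column, accepted in conserved_sites.items():
        residue = motif[column] if column < len(motif) else "-"
        if residue not in accepted:
            variations += 1
    return variations


def is_potential_member(motif, conserved_sites, max_variations=8):
    """Apply the 'fewer than nine variations' rule (at most eight)."""
    return count_variations(motif, conserved_sites) <= max_variations


# Invented toy consensus, NOT the 19 sites used in the study.
toy_sites = {0: {"R"}, 1: {"E", "K"}, 3: {"L"}, 5: {"Y"}}
perfect = count_variations("RKALQY", toy_sites)
imperfect = count_variations("AKAGQY", toy_sites)
```

With the real 19-site consensus, `is_potential_member` would keep any motif that deviates at eight sites or fewer.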

Phylogenetic Analyses

Phylogenetic analyses of all the identified giant panda bHLH members were carried out in two steps. First, all the obtained giant panda bHLH motif sequences were used to build a neighbor-joining (NJ) distance tree together with the 114 mouse bHLH motif sequences, using PAUP 4.0 Beta 10 [21] and a step matrix constructed from the Dayhoff PAM 250 distance matrix by R. K. Kuzoff (http://paup.csit.fsu.edu/nfiles.html). Then, each giant panda bHLH motif sequence was subjected to in-group phylogenetic analyses [9] with mouse bHLH motif sequences. That is, the amino acid sequence of each giant panda bHLH motif was used to construct NJ, maximum parsimony (MP) and maximum likelihood (ML) phylogenetic trees with the mouse bHLH family members of the corresponding group. The NJ trees were bootstrapped with 1,000 replicates to provide information about their statistical reliability. MP analysis was performed using heuristic searches and bootstrapped with 100 replicates. ML trees were constructed using TreePuzzle 5.2 [22] with the quartet-puzzling tree-search procedure and 25,000 puzzling steps. The substitution model was set to Jones-Taylor-Thornton [23]. Other parameters were set to default values.

Results and Discussion

Giant Panda bHLH Family Members

The tblastn searches, sequence alignment, and examination of the 19 conserved amino acid sites revealed 107 bHLH genes encoded in the giant panda genome. The names of all 107 giant panda bHLH members are listed in Table 1. Each identified giant panda bHLH (GpbHLH) gene was named according to the nomenclature used for mouse bHLH sequences. The alignment of all 107 GpbHLH motifs is shown in Figure S1, and the phylogenetic tree constructed from the amino acid sequences of the 107 GpbHLH motifs and 114 mouse bHLH motifs is shown in Figure S2. Figures S1 and S2 together show that there are 46, 25, 15, 4, 11 and 3 members in groups A, B, C, D, E and F, respectively, and that an additional 3 members were assigned as “orphan”. No gene encoding a member of the Delilah family was found in the giant panda genome. In Figure S1, the two most conserved sites are located at positions 23 and 59 of the bHLH motif. Eight other sites are also conserved, as indicated with asterisks at the top of Figure S1 (amino acid sequences of all 107 giant panda bHLH motifs are available in File S1).
Table 1

A complete list of bHLH genes from giant panda.

Family | Gene name | Mouse homologue | Bootstrap NJ | Bootstrap MP | Bootstrap ML | Protein accession number | Annotation in GenBank
ASCa | GpAsh1 | Mash1 | 99 | 92 | 99 | XP_002915515.1 | Hypothetical protein
ASCa | GpAsh2 | Mash2 | 92 | 91 | 97 | XP_002920180.1 | Ash2
ASCb | GpAsh3a | Mash3a | 98 | 99 | 90 | XP_002916197.1 | Ash3a
ASCb | GpAsh3b | Mash3b | 98 | 97 | 100 | hmm367624.p | Hypothetical protein
ASCb | GpAsh3c | Mash3c | 99 | 97 | 71 | hmm285394.p | Hypothetical protein
MyoD | GpMyoD | MyoD | 99 | 97 | 94 | XP_002928807.1 | MyoD
MyoD | GpMyoG | MyoG | 100 | 98 | 96 | XP_002925479.1 | MyoG
MyoD | GpMyf5 | Myf5 | 99 | 77 | 78 | XP_002916822.1 | Myf5
MyoD | GpMyf6 | Myf6 | 99 | 89 | 78 | XP_002916823.1 | Myf6
E12/E47 | GpTF12 | TF12 | 82 | n/m* | n/m* | XP_002920720.1 | TF12
E12/E47 | GpE2A | E2A | 100 | 97 | 98 | XP_002923565.1 | E2A
E12/E47 | GpKA1 | KA1 | 65 | 87 | 57 | Not available | /
E12/E47 | GpTCF4 | TCF4 | 90 | 21 | 82 | XP_002914713.1 | TCF4
Ngn | GpAth4a | Math4a | 99 | 95 | 97 | XP_002926036.1 | Neurogenin-2-like
Ngn | GpAth4b | Math4b | 99 | 93 | 99 | XP_002913770.1 | Neurogenin-3-like
Ngn | GpAth4c | Math4c | 99 | 85 | 99 | XP_002913012.1 | Neurogenin-1-like
NeuroD | GpNDF1 | NDF1 | 89 | 27 | 80 | XP_002922319.1 | NDF1
NeuroD | GpNDF2 | NDF2 | 88 | 68 | 89 | XP_002916875.1 | NDF2
NeuroD | GpAth2 | Math2 | 93 | 77 | 89 | XP_002919308.1 | NDF6
NeuroD | GpAth3 | Math3 | 99 | 96 | 98 | XP_002930692.1 | NDF4
Atonal | GpAth1 | Math1 | 100 | 99 | 100 | XP_002915330.1 | Ath1
Atonal | GpAth5 | Math5 | 100 | 100 | 100 | XP_002913786.1 | Ath7
Mist | GpMist1 | Mist1 | 99 | 97 | n/m | Not available | /
Beta3 | GpBeta3b a | Beta3b | 100 | 59 | n/m* | XP_002925784.1 | Class E bHLH protein 23
Oligo | GpOligo1 | Oligo1 | 91 | 85 | 98 | XP_002919636.1 | Oligo1
Oligo | GpOligo2 | Oligo2 | 88 | 59 | 56 | XP_002919637.1 | Oligo2
Oligo | GpOligo3 | Oligo3 | 90 | 73 | 98 | XP_002915132.1 | Oligo3
Net | GpAth6 | Math6 | 100 | 100 | 100 | XP_002928677.1 | Ath8
Mesp | GpMesp1 a | Mesp1 | 99 | 78 | n/m* | XP_002919616.1 | Mesp1
Mesp | GpPMeso1 | pMeso1 | 100 | 100 | 97 | XP_002915045.1 | Mesogenin-1
Twist | GpTwist | Twist | 91 | 64 | n/m* | XP_002915415.1 | Hypothetical protein
Twist | GpDermo1 | Dermo1 | 90 | 55 | 90 | XP_002922521.1 | Twist-2-like
Paraxis | GpParaxis | Paraxis | 78 | 70 | 86 | XP_002925450.1 | Transcription factor 15
MyoRa | GpMyoR | MyoR | 75 | 64 | 84 | XP_002922844.1 | Musculin
MyoRa | GpPod1 | Pod1 | 78 | 25 | n/m* | XP_002922333.1 | Transcription factor 21
MyoRb | GpMyoRb1 | MyoRb1 | 100 | 100 | 100 | XP_002916432.1 | Hypothetical protein
MyoRb | GpMyoRb2 | MyoRb2 | 100 | 99 | 93 | XP_002913861.1 | Transcription factor 23
Hand | GpDHand | dHand | 100 | 64 | n/m* | XP_002912726.1 | Hand 2
Hand | GpEHand | eHand | 99 | 99 | 99 | XP_002917201.1 | Hand 1
PTFa | GpPTFa | PTFa | 100 | 100 | 97 | XP_002913208.1 | Hypothetical protein
PTFb | GpPTFb | PTFb | 100 | 100 | 99 | XP_002915418.1 | Fer3
SCL | GpTal1 | Tal1 | 100 | 71 | 85 | hmm534354.p | Hypothetical protein
SCL | GpTal2 | Tal2 | 99 | 94 | 88 | XP_002927719.1 | Tal2
SCL | GpLyl1 | Lyl1 | 99 | 99 | 100 | XP_002921032.1 | Lyl1
NSCL | GpHen1 a | Hen1 | 100 | 100 | 89 | XP_002928490.1 | HLH protein-1-like
NSCL | GpHen2 | Hen2 | 60 | 40 | n/m* | XP_002925359.1 | HLH protein-2-like
SRC | GpSRC1 | SRC1 | 100 | 99 | 98 | XP_002913840.1 | NcoA 1
SRC | GpSRC2 | SRC2 | 100 | 100 | 99 | XP_002918551.1 | NcoA 2
SRC | GpSRC3 | SRC3 | 100 | 100 | 99 | XP_002927046.1 | NcoA 3
FIGα | GpFiga | Figa | 100 | 100 | 100 | XP_002914962.1 | Figa
Myc | GpN-Myc | N-Myc | 99 | 70 | 85 | XP_002923116.1 | N-Myc
Myc | GpC-Myc | C-Myc | 100 | 100 | 100 | XP_002915028.1 | C-Myc
Myc | GpL-Myc | L-Myc | 100 | 100 | 99 | XP_002927604.1 | L-Myc
Mad | GpMxi1 | Mxi1 | 100 | 99 | 96 | XP_002930845.1 | Mxi1
Mad | GpMad1 | Mad1 | 100 | 100 | 92 | XP_002914951.1 | Mad1
Mad | GpMad3 | Mad3 | 100 | 98 | 91 | XP_002928184.1 | Mad3
Mad | GpMad4 | Mad4 | 98 | 87 | 91 | XP_002916603.1 | Mad4
Mnt | GpMnt | Mnt | 100 | 100 | 99 | XP_002918069.1 | Mnt
Max | GpMax | Max | 100 | 100 | 100 | XP_002914193.1 | Max
USF | GpUSF1 | USF1 | 100 | 98 | 100 | XP_002928795.1 | USF1
USF | GpUSF2 | USF2 | 92 | 38 | 99 | XP_002920933.1 | USF2
MITF | GpMITF | MITF | 75 | n/m* | n/m* | XP_002927657.1 | MITF
MITF | GpTFEb | TFEb | 100 | 100 | 98 | XP_002914561.1 | TFEb
MITF | GpTFEc | TFEc | 98 | 97 | 96 | XP_002923929.1 | TFEc
MITF | GpTFE3 | TFE3 | 63 | 52 | n/m | XP_002917800.1 | TFE3
SREBP | GpSREBP1 | SREBP1 | 100 | 99 | n/m* | XP_002923179.1 | SREBP1
SREBP | GpSREBP2 | SREBP2 | 100 | 100 | 100 | XP_002929331.1 | SREBP2
AP4 | GpAP4 | AP4 | 100 | 100 | 99 | XP_002924645.1 | AP4
MLX | GpMlx | Mlx | 100 | 99 | 92 | XP_002923532.1 | WBSCR14
MLX | GpMondoA | MondoA | 100 | 100 | 97 | XP_002913172.1 | Mlx-interacting protein
TF4 | GpTF4 | TF4 | 100 | 100 | 100 | XP_002922185.1 | Max-like protein X
Clock | GpClk | Clk | 100 | 95 | 100 | XP_002919413.1 | Clk
Clock | GpNPAS2 | NPAS2 | 100 | 99 | 100 | XP_002919235.1 | NPAS2
ARNT | GpARNT1 | ARNT1 | 97 | 61 | n/m* | XP_002919403.1 | ARNT1
ARNT | GpARNT2 | ARNT2 | 96 | 87 | 97 | XP_002919129.1 | ARNT2
Bmal | GpBmal1 | Bmal1 | 100 | 99 | 97 | XP_002926157.1 | ARNT-like protein 1
Bmal | GpBmal2 a | Bmal2 | 100 | 100 | 100 | XP_002917162.1 | ARNT-like protein 2
Sim | GpSim1 | Sim1 | 97 | 74 | 96 | XP_002922016.1 | Sim1
Sim | GpSim2 | Sim2 | 97 | 83 | 90 | hmm348774.p | Hypothetical protein
AHR | GpAHR1 | AHR1 | 100 | 99 | 100 | XP_002917450.1 | AHR1
AHR | GpAHR2 | AHR2 | 81 | 64 | n/m | XP_002926684.1 | AHRR
Trh | GpNPAS3 | NPAS3 | 100 | 91 | 82 | hmm740504.p | Hypothetical protein
HIF | GpHif1a | Hif1a | 100 | 100 | 100 | XP_002913080.1 | Hif1a
HIF | GpHif3a | Hif3a | 100 | 100 | 100 | XP_002923099.1 | Hif3a
HIF | GpNPAS1 | NPAS1 | 100 | 100 | 95 | XP_002923107.1 | NPAS1
HIF | GpEPAS1 | EPAS1 | 100 | 99 | 100 | XP_002912483.1 | EPAS1
Emc | GpId1 | Id1 | 93 | 57 | n/m* | hmm387023.p | Hypothetical protein
Emc | GpId2 | Id2 | 87 | 82 | 56 | XP_002923275.1 | Id2
Emc | GpId3 | Id3 | 99 | 92 | 100 | XP_002913316.1 | Id3
Emc | GpId4 | Id4 | 100 | 90 | 76 | hmm7844.p | Hypothetical protein
Hey | GpHerp1 | Herp1 | 96 | 86 | 96 | XP_002927896.1 | Herp1
Hey | GpHerp2 | Herp2 | 96 | 50 | n/m* | XP_002915182.1 | Herp2
Hey | GpHEYL | HEYL | 98 | 94 | 98 | XP_002930399.1 | HEYL
Hey | GpHey4 | Hey4 | 100 | 100 | 92 | XP_002914075.1 | HELT-like protein
H/E(spl) | GpDec1 | Dec1 | 76 | 67 | 99 | XP_002920034.1 | Class E bHLH protein 40
H/E(spl) | GpDec2 | Dec2 | 73 | n/m* | n/m* | hmm164814.p | Hypothetical protein
H/E(spl) | GpHes1a | Hes1 | 99 | 66 | 84 | XP_002930213.1 | Hes4-like protein
H/E(spl) | GpHes1b | Hes1 | 100 | 100 | 100 | XP_002923794.1 | Hes1
H/E(spl) | GpHes2 | Hes2 | 97 | 67 | 88 | XP_002923913.1 | Hes2
H/E(spl) | GpHes3 | Hes3 | 100 | 97 | 98 | XP_002923915.1 | Hes3
H/E(spl) | GpHes7 | Hes7 | 100 | 100 | 97 | hmm475304.p | Hypothetical protein
COE | GpEBF1 | EBF1 | 95 | 45 | 58 | XP_002912553.1 | COE1
COE | GpEBF2 | EBF2 | 94 | 89 | 56 | XP_002914472.1 | COE2
COE | GpEBF3 | EBF3 | 72 | n/m* | n/m* | XP_002922830.1 | COE3
Orphan | GpOrphan2 | Orphan2 | 100 | 100 | 59 | XP_002913251.1 | MAX gene-associated
Orphan | GpOrphan3 | Orphan3 | 100 | 100 | n/m* | XP_002923506.1 | Sohlh2-like protein
Orphan | GpOrphan4 | Orphan4 | 100 | 100 | 100 | Not available | /

Note: Giant panda bHLH genes were named according to their mouse homologues. Bootstrap values were from in-group phylogenetic analyses with mouse bHLH motif sequences using NJ, MP, and ML algorithms, respectively. OsRa (the rice bHLH motif sequence of R family) was used as the outgroup in every constructed tree except those for GpHen1 and GpBmal2 which used separate outgroup sequence. n/m means that a giant panda bHLH does not form a monophyletic group with any other single bHLH motif sequence. n/m* means that a giant panda bHLH does not form a monophyletic clade with any specific bHLH motif sequence but forms a monophyletic clade with other bHLH proteins of the same family.

a means that the gene's orthology was defined by in-group phylogenetic analyses with the corresponding whole bHLH protein sequences from mouse. The accession numbers are from different protein resources: those labeled “XP” are from the ‘RefSeq protein’ database and those labeled “hmm” are from the ‘Ab initio protein’ database of giant panda.


Identification of Orthologous Families

Ortholog identification carries considerable uncertainty, since there is no absolute criterion for deciding whether two genes are orthologous [17]. In our previous studies [9], [10], in-group phylogenetic analysis was adopted to identify homologues of unknown sequences, using a stricter criterion based on the one used by Ledent et al. [17], [24]: if an unknown giant panda bHLH forms a monophyletic clade with another bHLH of known family in phylogenetic trees constructed with different methods, and all the bootstrap values exceed 50, the known member is regarded as a homologue of the unknown sequence. As an example, Figure S3 shows the NJ, MP and ML phylogenetic trees constructed with one GpbHLH member (GpAsh1) and eight group A bHLH members from mouse. In all three trees, GpAsh1 formed a monophyletic clade with mouse Mash1, with bootstrap values ranging from 92 to 100. GpAsh1 was therefore considered an ortholog of mouse Mash1. Similar in-group phylogenetic analyses were conducted for each identified GpbHLH member, using Figure S2 to select appropriate related mouse bHLH members. The bootstrap values of all constructed NJ, MP and ML trees are listed in Table 1; the corresponding trees are not shown.

Table 1 shows that the orthology of GpbHLH members with mouse falls into the following categories. Firstly, 83 of the 107 GpbHLH members have bootstrap values over 50 (55 ≤ bootstrap value ≤ 100) in all constructed NJ, MP and ML trees. We have sufficient confidence to assign these GpbHLH motifs to their corresponding mouse bHLH orthologs. Secondly, 4 GpbHLH members, namely GpTCF4, GpNDF1, GpUSF2 and GpEBF1, formed a monophyletic clade with bootstrap values over 50 in the NJ and ML trees. Although they also formed a monophyletic clade in the MP trees, their MP bootstrap values ranged only from 21 to 45. The orthology of these 4 GpbHLH members was therefore defined according to the statistical support from the NJ and ML trees. A further 10 GpbHLH members, namely GpMist1, GpAHR2, GpTwist, GpDHand, GpARNT1, GpSREBP1, GpTFE3, GpId1, GpHerp2 and GpOrphan3, formed a monophyletic clade with bootstrap values ranging from 50 to 100 in the NJ and MP trees, but did not form a monophyletic group with any single bHLH sequence in the ML trees (marked with n/m* or n/m in Table 1). For these 10 GpbHLH members, orthology was defined according to the statistical support from the NJ and MP trees. Thirdly, 2 GpbHLH members, namely GpPod1 and GpHen2, formed a monophyletic clade in the NJ and MP trees with bootstrap values ranging from 20 to 79 but did not form a monophyletic group in the ML tree. Another 4 GpbHLH members, namely GpTF12, GpMITF, GpDec2 and GpEBF3, formed a monophyletic clade with bootstrap values ranging from 72 to 82 in the NJ tree, but did not form a monophyletic clade in the MP and ML trees. Although these 6 GpbHLH members did not have sufficient bootstrap support, we defined orthologs for them because each has one or two supporting bootstrap values for its orthology to the corresponding mouse bHLH. This phylogenetic divergence of bHLH motif sequences between giant panda and mouse probably reflects that the two mammals have evolved under quite different circumstances. Finally, 4 GpbHLH motif sequences did not form a monophyletic clade with mouse bHLH motif sequences in most of the constructed phylogenetic trees. For these, namely GpBeta3b, GpMesp1, GpHen1 and GpBmal2, whole protein sequences were used to conduct in-group phylogenetic analyses with the whole sequences of the corresponding mouse bHLH proteins to define their orthology (marked with a in Table 1).
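The support categories described above can be expressed as a small rule over the three bootstrap columns of Table 1. This is an illustrative sketch, not the authors' procedure. The function names are hypothetical, and the threshold of "at least 50" is an assumption chosen so that GpHerp2 (MP = 50) falls in the NJ and MP group, as the text states.

```python
def supported(value, threshold=50):
    """True if a Table 1 bootstrap cell is numeric and meets the threshold.

    Cells reading 'n/m' or 'n/m*' (no monophyletic clade) count as unsupported.
    """
    return not value.startswith("n/m") and int(value) >= threshold


def support_category(nj, mp, ml):
    """Name the set of tree methods that support the orthology call."""
    flags = (supported(nj), supported(mp), supported(ml))
    if flags == (True, True, True):
        return "NJ+MP+ML"
    if flags == (True, False, True):
        return "NJ+ML"
    if flags == (True, True, False):
        return "NJ+MP"
    return "weak"


# Bootstrap cells copied from Table 1 as (NJ, MP, ML).
examples = {
    "GpAsh1": ("99", "92", "99"),
    "GpTCF4": ("90", "21", "82"),
    "GpMist1": ("99", "97", "n/m"),
    "GpHerp2": ("96", "50", "n/m*"),
    "GpPod1": ("78", "25", "n/m*"),
    "GpTF12": ("82", "n/m*", "n/m*"),
}
categories = {gene: support_category(*cells) for gene, cells in examples.items()}

# Category sizes as reported in the text; they account for all 107 members.
reported = {"NJ+MP+ML": 83, "NJ+ML": 4, "NJ+MP": 10, "weak": 6, "whole protein": 4}
total_members = sum(reported.values())
```

The four genes marked with a in Table 1 cannot be recovered by this rule, because their listed values come from whole-protein analyses rather than motif-only trees.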

Protein Sequences and Genomic Coding Regions of Giant Panda bHLH Genes

Protein sequence accession numbers of all identified GpbHLH motifs are listed in Table 1. For 95 GpbHLH motifs, accession numbers were found in the ‘RefSeq protein’ database (shown as ‘XP’ plus number). For 9 GpbHLH motifs, accession numbers were found only in the ‘Ab initio protein’ database, in which all protein sequences were predicted from the corresponding genomic sequences (shown as ‘hmm’ plus number); these are GpAsh3b, GpAsh3c, GpTal1, GpSim2, GpNPAS3, GpId1, GpId4, GpDec2 and GpHes7. For 3 GpbHLH protein sequences, namely GpKA1, GpMist1 and GpOrphan4, no accession number was found in any protein database. Table 1 shows that, among the 104 bHLH protein sequences deposited in giant panda databases, 58 were annotated in full agreement with our analysis (the name in the “Annotation in GenBank” column matches that in the “Gene name” column), 33 were annotated differently (the names in the two columns differ), and 13 were merely predicted proteins (indicated as “Hypothetical protein”). Our work therefore not only newly identified these 13 protein sequences as bHLH family members but also provides additional information for further investigation of the 33 differently annotated bHLH protein sequences.

The coding regions and intron analysis for the 107 giant panda bHLH motifs are listed in Table 2. The intron analysis showed that 47 GpbHLH members have introns in their bHLH motifs: (i) 26 GpbHLH members have one intron, of which 13 have the intron in the basic region, 12 in the loop region, and 1 in the helix 2 region; (ii) 19 GpbHLH members have two introns, of which 15 have introns in the basic and loop regions, 3 in the basic and helix 2 regions, and 1 in the helix 1 and helix 2 regions; (iii) 2 GpbHLH members have three introns, of which two are located in the basic region and one in the helix 2 region. Altogether, 70 introns were identified in the 47 GpbHLH motifs. The longest intron in a GpbHLH motif is 45,217 bp (base pairs), and the average intron length is 4,393 bp. These data are comparable with those of mouse, in which 47 bHLH members also have introns in their bHLH motifs. The total number of mouse introns identified is 73, with the longest being 48,288 bp and the average length 4,286 bp (data not shown).
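As a quick consistency check on the intron counts reported above, the per-category numbers can be summed. The snippet is plain arithmetic on the figures in the text; the variable names are illustrative.

```python
# Number of GpbHLH members per intron pattern, as reported in the text.
one_intron = {"basic": 13, "loop": 12, "helix 2": 1}
two_introns = {"basic + loop": 15, "basic + helix 2": 3, "helix 1 + helix 2": 1}
three_introns = {"basic + basic + helix 2": 2}

members_with_introns = (
    sum(one_intron.values())
    + sum(two_introns.values())
    + sum(three_introns.values())
)
total_introns = (
    1 * sum(one_intron.values())
    + 2 * sum(two_introns.values())
    + 3 * sum(three_introns.values())
)
```

The sums give 47 members (26 + 19 + 2) and 70 introns (26 + 38 + 6), in agreement with the totals stated in the paragraph.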
Table 2

Coding regions, intron location and length of 107 giant panda bHLH motifs.

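Family | Gene name | Contig No. | Frame | Coding region | Intron (location: length)
ASCa | GpAsh1 | NW_003217384.1 | +1 | 937516-937674 |
ASCa | GpAsh2 | NW_003217681.1 | −2 | 1024199-1024041 |
ASCb | GpAsh3a | NW_003217414.1 | +1 | 1397734-1397892 |
ASCb | GpAsh3b | NW_003217343.1 | +3 | 1118238-1118387 |
ASCb | GpAsh3c | NW_003217785.1 | −3 | 1162741-1162583 |
MyoD | GpMyoD | NW_003218874.1 | +3 | 47166-47321 |
MyoD | GpMyoG | NW_003218226.1 | −1 | 585332-585177 |
MyoD | GpMyf5 | NW_003217445.1 | −1 | 705066-704911 |
MyoD | GpMyf6 | NW_003217445.1 | −3 | 714220-714065 |
E12/E47 | GpTF12 | NW_003217723.1 | −3 | 878591-878430 |
E12/E47 | GpE2A | NW_003217991.1 | −2 | 930301-930140 |
E12/E47 | GpKA1 | NW_003217991.1 | −1 | 932483-932322 |
E12/E47 | GpTCF4 | NW_003217346.1 | +1 | 1744351-1744512 |
Ngn | GpAth4a | NW_003218321.1 | −3 | 24207-24049 |
Ngn | GpAth4b | NW_003217318.1 | +1 | 2044831-2044989 |
Ngn | GpAth4c | NW_003217300.1 | −3 | 4043793-4043635 |
NeuroD | GpNDF1 | NW_003217851.1 | +3 | 616848-617006 |
NeuroD | GpNDF2 | NW_003217447.1 | +2 | 18473-18631 |
NeuroD | GpAth2 | NW_003217612.1 | −1 | 1092508-1092350 |
NeuroD | GpAth3 | NW_003219813.1 | −3 | 40317-40159 |
Atonal | GpAth1 | NW_003217374.1 | +2 | 262304-262462 |
Atonal | GpAth5 | NW_003217318.1 | +3 | 3168366-3168524 |
Mist | GpMist1 | NW_003218585.1 | −1 | 220417-220259 |
Beta3 | GpBeta3b | NW_003218276.1 | +1 | 208657-208821 |
Oligo | GpOligo1 | NW_003217632.1 | −2 | 1158013-1157843 |
Oligo | GpOligo2 | NW_003217632.1 | −3 | 1198932-1198768 |
Oligo | GpOligo3 | NW_003217365.1 | −3 | 2603795-2603631 |
Net | GpAth6 | NW_003218843.1 | +3 | 61194-61271 | Loop: 8,330 bp
Net | GpAth6 | NW_003218843.1 | +2 | 69602-69682 |
Mesp | GpMesp1 | NW_003217631.1 | −3 | 1622851-1622687 |
Mesp | GpPMeso1 | NW_003217360.1 | +3 | 779709-779873 |
Twist | GpTwist | NW_003217378.1 | −2 | 1057777-1057622 |
Twist | GpDermo1 | NW_003217871.1 | −3 | 1180296-1180141 |
Paraxis | GpParaxis | NW_003218220.1 | −1 | 310143-310117 | Basic: 29,249 bp
Paraxis | GpParaxis | NW_003218220.1 | −3 | 280867-280736 |
MyoRa | GpMyoR | NW_003217902.1 | −2 | 213023-212865 |
MyoRa | GpPod1 | NW_003217853.1 | +1 | 879529-879687 |
MyoRb | GpMyoRb1 | NW_003217428.1 | −3 | 2043881-2043723 |
MyoRb | GpMyoRb2 | NW_003217319.1 | −1 | 759695-759537 |
Hand | GpDHand | NW_003217296.1 | +2 | 4789691-4789849 |
Hand | GpEHand | NW_003217471.1 | +1 | 889240-889398 |
PTFa | GpPTFa | NW_003217305.1 | +2 | 3160973-3161131 |
PTFb | GpPTFb | NW_003217378.1 | −2 | 1090543-1090385 |
SCL | GpTal1 | NW_003218755.1 | +2 | 402998-403156 |
SCL | GpTal2 | NW_003218606.1 | +3 | 375012-375170 |
SCL | GpLyl1 | NW_003217749.1 | −1 | 440560-440402 |
NSCL | GpHen1 | NW_003218810.1 | −1 | 310980-310822 |
NSCL | GpHen2 | NW_003218212.1 | +3 | 118701-118850 | No intron (two separate contigs)
NSCL | GpHen2 | NW_003222115.1 | −1 | 2942-2934 |
SRC | GpSRC1 | NW_003217319.1 | −3 | 2757930-2757913 | Basic: 6,972 bp
SRC | GpSRC1 | NW_003217319.1 | −3 | 2750940-2750785 |
SRC | GpSRC2 | NW_003217553.1 | −1 | 1113592-1113587 | Basic: 2,669 bp
SRC | GpSRC2 | NW_003217553.1 | −3 | 1110917-1110750 |
SRC | GpSRC3 | NW_003218488.1 | −3 | 105991-105983 | Basic: 876 bp
SRC | GpSRC3 | NW_003218488.1 | −3 | 105106-104942 |
FIGα | GpFiga | NW_003217356.1 | +3 | 2013393-2013428 | Basic: 5,738 bp
FIGα | GpFiga | NW_003217356.1 | +2 | 2019167-2019289 |
Myc | GpN-Myc | NW_003217942.1 | −3 | 698202-698044 |
Myc | GpC-Myc | NW_003217359.1 | −3 | 2278151-2277993 |
Myc | GpL-Myc | NW_003218583.1 | −2 | 137657-137499 |
Mad | GpMxi1 | NW_003219956.1 | −1 | 80826-80792 | Basic: 29,318 bp
Mad | GpMxi1 | NW_003219956.1 | −3 | 50884-50770 | Helix 2: 613 bp
Mad | GpMxi1 | NW_003219956.1 | −1 | 50156-50148 |
Mad | GpMad1 | NW_003217356.1 | −1 | 2726736-2726732 | Basic: 4,015 bp
Mad | GpMad1 | NW_003217356.1 | −3 | 2721716-2721687 | Basic: 14,277 bp
Mad | GpMad1 | NW_003217356.1 | −3 | 2707409-2707295 | Helix 2: 1,545 bp
Mad | GpMad1 | NW_003217356.1 | −3 | 2705749-2705741 |
Mad | GpMad3 | NW_003218733.1 | +2 | 210165-210168 | Basic: 647 bp
Mad | GpMad3 | NW_003218733.1 | +2 | 210816-210841 | Basic: 128 bp
Mad | GpMad3 | NW_003218733.1 | +1 | 210970-211089 | Helix 2: 2,020 bp
Mad | GpMad3 | NW_003218733.1 | +2 | 213110-213118 |
Mad | GpMad4 | NW_003217437.1 | −1 | 395863-395833 | Basic: 563 bp
Mad | GpMad4 | NW_003217437.1 | −2 | 390270-390154 | Helix 2: 948 bp
Mad | GpMad4 | NW_003217437.1 | −2 | 389205-389197 |
Mnt | GpMnt | NW_003217523.1 | +3 | 590949-590983 | Basic: 142 bp
Mnt | GpMnt | NW_003217523.1 | +1 | 591126-591237 | Helix 2: 5,001 bp
Mnt | GpMnt | NW_003217523.1 | +1 | 596239-596247 |
Max | GpMax | NW_003217331.1 | +3 | 1173984-1174085 | Loop: 13,160 bp
Max | GpMax | NW_003217331.1 | +2 | 1187246-1187299 |
USF | GpUSF1 | NW_003218870.1 | +3 | 285504-285525 | Basic: 121 bp
USF | GpUSF1 | NW_003218870.1 | +1 | 285647-285741 | Loop: 250 bp
USF | GpUSF1 | NW_003218870.1 | +3 | 285990-286040 |
USF | GpUSF2 | NW_003217744.1 | +3 | 363972-363992 | Basic: 6,252 bp
USF | GpUSF2 | NW_003217744.1 | +1 | 370243-370338 | Loop: 103 bp
USF | GpUSF2 | NW_003217744.1 | +2 | 370442-370492 |
MITF | GpMITF | NW_003218591.1 | +3 | 401976-401997 | Basic: 5,362 bp
MITF | GpMITF | NW_003218591.1 | +1 | 407360-407436 | Loop: 3,549 bp
MITF | GpMITF | NW_003218591.1 | +1 | 410986-411048 |
MITF | GpTFEb | NW_003217341.1 | +3 | 3067443-3067464 | Basic: 617 bp
MITF | GpTFEb | NW_003217341.1 | +2 | 3068082-3068158 | Loop: 644 bp
MITF | GpTFEb | NW_003217341.1 | +1 | 3068803-3068865 |
MITF | GpTFEc | NW_003218029.1 | +2 | 296780-296801 | Basic: 3,864 bp
MITF | GpTFEc | NW_003218029.1 | +2 | 300666-300741 | Loop: 4,628 bp
MITF | GpTFEc | NW_003218029.1 | +1 | 305370-305433 |
MITF | GpTFE3 | NW_003217513.1 | +2 | 815657-815678 | Basic: 172 bp
MITF | GpTFE3 | NW_003217513.1 | +3 | 815851-815927 | Loop: 1,124 bp
MITF | GpTFE3 | NW_003217513.1 | +2 | 817052-817114 |
SREBP | GpSREBP1 | NW_003217951.1 | +1 | 876259-876357 | Loop: 642 bp
SREBP | GpSREBP1 | NW_003217951.1 | +1 | 877000-877053 |
SREBP | GpSREBP2 | NW_003219081.1 | −2 | 42239-42141 | Loop: 1,224 bp
SREBP | GpSREBP2 | NW_003219081.1 | −2 | 40916-40863 |
AP4 | GpAP4 | NW_003218113.1 | +3 | 891114-891224 | Loop: 117 bp
AP4 | GpAP4 | NW_003218113.1 | +3 | 891342-891386 |
MLX | GpMlx | NW_003217989.1 | −2 | 869754-869644 | Loop: 186 bp
MLX | GpMlx | NW_003217989.1 | −2 | 869457-869404 |
MLX | GpMondoA | NW_003217304.1 | +1 | 2129887-2129982 | Loop: 108 bp
MLX | GpMondoA | NW_003217304.1 | +1 | 2130091-2130147 |
TF4 | GpTF4 | NW_003217842.1 | +2 | 991667-991717 | Helix 1: 260 bp
TF4 | GpTF4 | NW_003217842.1 | +1 | 991978-992077 | Helix 2: 194 bp
TF4 | GpTF4 | NW_003217842.1 | +3 | 992272-992297 |
Clock | GpClk | NW_003217618.1 | −3 | 1508321-1508314 | Basic: 3,666 bp
Clock | GpClk | NW_003217618.1 | −3 | 1504647-1504503 |
Clock | GpNPAS2 | NW_003217607.1 | −1 | 787626-787622 | Basic: 236 bp
Clock | GpNPAS2 | NW_003217607.1 | −3 | 767385-767238 |
ARNT | GpARNT1 | NW_003217617.1 | +1 | 1009551-1009555 | Basic: 2,103 bp
ARNT | GpARNT1 | NW_003217617.1 | +3 | 1011659-1011815 |
ARNT | GpARNT2 | NW_003217598.1 | −1 | 1385223-1385219 | Basic: 7,467 bp
ARNT | GpARNT2 | NW_003217598.1 | −2 | 1377751-1377595 |
Bmal | GpBmal1 | NW_003218335.1 | +2 | 511835-511839 | Basic: 1,225 bp
Bmal | GpBmal1 | NW_003218335.1 | +3 | 513065-513221 |
Bmal | GpBmal2 | NW_003217468.1 | −2 | 955835-955831 | Basic: 3,125 bp
Bmal | GpBmal2 | NW_003217468.1 | −3 | 952705-952549 |
Sim | GpSim1 | NW_003217828.1 | +1 | 640339-640500 |
Sim | GpSim2 | NW_003217464.1 | +2 | 2035250-2035411 |
AHR | GpAHR1 | NW_003217483.1 | +2 | 1606550-606711 |
AHR | GpAHR2 | NW_003218420.1 | −1 | 471842-471681 |
Trh | GpNPAS3 | NW_003219637.1 | −2 | 62695-2534 |
HIF | GpHif1a | NW_003217302.1 | +2 | 4505891-4506052 |
HIF | GpHif3a | NW_003217939.1 | −2 | 1043816-1043655 |
HIF | GpNPAS1 | NW_003217939.1 | −2 | 538490-538344 |
HIF | GpEPAS1 | NW_003217290.1 | +2 | 3566561-3566722 |
Emc | GpId1 | NW_003217538.1 | −3 | 1629899-1629801 |
Emc | GpId2 | NW_003217962.1 | +1 | 246616-246714 |
Emc | GpId3 | NW_003217307.1 | −2 | 256780-256682 |
Emc | GpId4 | NW_003218297.1 | +3 | 402159-402257 |
Hey | GpHerp1 | NW_003218647.1 | +3 | 207195-207212 | Basic: 125 bp
Hey | GpHerp1 | NW_003218647.1 | +2 | 207338-207421 | Loop: 276 bp
Hey | GpHerp1 | NW_003218647.1 | +2 | 207698-207763 |
Hey | GpHerp2 | NW_003217369.1 | −1 | 1395712-1395695 | Basic: 132 bp
Hey | GpHerp2 | NW_003217369.1 | −1 | 1395562-1395479 | Loop: 2,267 bp
Hey | GpHerp2 | NW_003217369.1 | −3 | 1393211-1393446 |
Hey | GpHEYL | NW_003219577.1 | −1 | 109083-109066 | Basic: 1,029 bp
Hey | GpHEYL | NW_003219577.1 | −1 | 108036-107953 | Loop: 1,266 bp
Hey | GpHEYL | NW_003219577.1 | −3 | 106684-106619 |
Hey | GpHey4 | NW_003217325.1 | +3 | 3706089-3706190 | Loop: 271 bp
Hey | GpHey4 | NW_003217325.1 | +1 | 3706462-3706527 |
H/E(spl) | GpDec1 | NW_003217667.1 | +1 | 608623-608724 | Loop: 941 bp
H/E(spl) | GpDec1 | NW_003217667.1 | +3 | 609666-609731 |
H/E(spl) | GpDec2 | NW_003217468.1 | +3 | 2062578-2062679 | Loop: 321 bp
H/E(spl) | GpDec2 | NW_003217468.1 | +1 | 2063002-2063067 |
H/E(spl) | GpHes1a | NW_003219474.1 | −2 | 115790-115785 | Basic: 82 bp
H/E(spl) | GpHes1a | NW_003219474.1 | −3 | 115702-115607 | Loop: 96 bp
H/E(spl) | GpHes1a | NW_003219474.1 | −3 | 115510-115439 |
H/E(spl) | GpHes1b | NW_003218013.1 | −1 | 184110-184105 | Basic: 135 bp
H/E(spl) | GpHes1b | NW_003218013.1 | −1 | 183969-183874 | Loop: 194 bp
H/E(spl) | GpHes1b | NW_003218013.1 | −3 | 183679-183608 |
H/E(spl) | GpHes2 | NW_003218027.1 | +3 | 663288-663293 | Basic: 86 bp
H/E(spl) | GpHes2 | NW_003218027.1 | +2 | 663380-663475 | Loop: 208 bp
H/E(spl) | GpHes2 | NW_003218027.1 | +3 | 663684-663755 |
H/E(spl) | GpHes3 | NW_003266724.1 | −3 | 806524-806423 | Loop: 101 bp
H/E(spl) | GpHes3 | NW_003266724.1 | −2 | 806321-806253 |
H/E(spl) | GpHes7 | NW_003217758.1 | +2 | 1201700-1201801 | Loop: 589 bp
H/E(spl) | GpHes7 | NW_003217758.1 | +3 | 1202391-1202462 |
COE | GpEBF1 | NW_003217293.1 | −1 | 2621749-2621723 | Basic: 45,217 bp
COE | GpEBF1 | NW_003217293.1 | −2 | 2576505-2576416 | Loop: 16,756 bp
COE | GpEBF1 | NW_003217293.1 | −3 | 2559659-2559615 |
COE | GpEBF2 | NW_003217339.1 | +3 | 2341320-2341347 | Basic: 19,688 bp
COE | GpEBF2 | NW_003217339.1 | +2 | 2361036-2361124 | Loop: 1,270 bp
COE | GpEBF2 | NW_003217339.1 | +3 | 2362395-2362439 |
COE | GpEBF3 | NW_003217894.1 | +2 | 583319-583346 | Basic: 17,574 bp
COE | GpEBF3 | NW_003217894.1 | +2 | 600921-601009 | Loop: 5,187 bp
COE | GpEBF3 | NW_003217894.1 | +2 | 606197-606241 |
Orphan | GpOrphan2 | NW_003217306.1 | −2 | 2667874-2667746 | Helix 2: 1,301 bp
Orphan | GpOrphan2 | NW_003217306.1 | −1 | 2666444-2666418 |
Orphan | GpOrphan3 | NW_003217988.1 | +1 | 82414-82451 | Basic: 16,940 bp
Orphan | GpOrphan3 | NW_003217988.1 | +3 | 99392-99509 |
Orphan | GpOrphan4 | NW_003218276.1 | +2 | 325310-325347 | Basic: 1,985 bp
Orphan | GpOrphan4 | NW_003218276.1 | +1 | 327333-327441 |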

Note: Basic, helix 1, loop, and helix 2 regions are delineated as shown in Figure S1.
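The coding-region coordinates in Table 2 can be related to the listed intron lengths with simple arithmetic: the intron is the gap between the end of one coding segment and the start of the next. The sketch below is an illustration under that assumption and is not the NetGene2 procedure used in the study; the function names are hypothetical. It reproduces the listed lengths for the two rows used here (GpAth6 and GpSRC1) and is not guaranteed to do so for every row.

```python
def intron_lengths(segments):
    """Gap sizes between consecutive coding segments.

    segments is a list of (start, end) pairs in transcription order, as in
    Table 2. Using abs() lets the same code handle both strands.
    """
    return [
        abs(next_start - prev_end) - 1
        for (_, prev_end), (next_start, _) in zip(segments, segments[1:])
    ]


def coding_length(segments):
    """Total number of coding bases across all segments."""
    return sum(abs(end - start) + 1 for start, end in segments)


# Coordinates copied from Table 2.
gp_ath6 = [(61194, 61271), (69602, 69682)]          # forward strand
gp_src1 = [(2757930, 2757913), (2750940, 2750785)]  # reverse strand

ath6_introns = intron_lengths(gp_ath6)  # Table 2 lists "Loop: 8,330 bp"
src1_introns = intron_lengths(gp_src1)  # Table 2 lists "Basic: 6,972 bp"
ath6_coding = coding_length(gp_ath6)
```

For GpAth6 the two segments add up to 159 coding bases, the same motif length as single-segment genes such as GpAsh1 (937516-937674).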


The Giant Panda bHLH Repertoire

Compared to the 114 bHLH family members of mouse, the giant panda was found to have one fewer member in each of 7 bHLH families, namely Beta3, Mesp, Paraxis, Myc, Hes, COE and Orphan. The missing bHLH family members are Beta3a, Mesp2, Sclerax, S-Myc, Hes5 (or Hes6), EBF4 and Orphan 1, respectively. Based on the available data, it is difficult to say whether the giant panda truly lacks these bHLH genes. At present, bHLH family members have been identified and classified in three mammalian species (human, mouse and rat) [4], [7]. Given that human differs from mouse and rat in only 2 bHLH families, namely Myc and H/E(spl), it seems unlikely that the giant panda differs in 7 bHLH families. Moreover, of the 7 family members missing in giant panda, zebrafish and chicken lack only one (S-Myc) and two (S-Myc and EBF4), respectively [11], [12]. We therefore expect that additional bHLH members may be found once a new, higher-quality version of the giant panda genome sequence is released. Nevertheless, given that very little information is available on bHLH genes and their functions among bear species, our data provide useful background information for further studies on the regulatory functions of bHLH proteins in giant panda and other bear species.

Alignment of 107 giant panda bHLH family members. Designation of basic, helix 1, loop and helix 2 follows Ferre-D'Amare et al. [25]. The family names and high-order groups have been organized according to Table 1 of Ledent et al. [24]. Highly conserved sites are indicated with asterisks at the top. The first five amino acids of NPAS1 were not available because the corresponding genomic contig sequence is incomplete. (TIF)

Phylogenetic relationship of 107 giant panda and 114 mouse bHLH members. The tree was constructed using the neighbor-joining algorithm, with OsRa (the rice bHLH motif sequence of the R family) as the outgroup. For simplicity, branch lengths of the tree are not proportional to distances between sequences, and bootstrap values less than 50 are not shown. The higher-order group labels are in accordance with Ledent et al. [24]. (TIF)

In-group phylogenetic analyses of GpAsh1. Panels (a), (b) and (c) are the NJ, MP and ML trees, respectively, each constructed with one giant panda bHLH member (GpAsh1) and nine group A bHLH members from mouse. In all trees, OsRa was used as the outgroup. (TIF)

Amino acid sequences of 107 giant panda bHLH motifs. The giant panda bHLH family members are arranged in the same order as in Tables 1 and 2, where their family assignments and their protein and coding region information can be found. (DOC)
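The repertoire comparison in the discussion above reduces to simple set arithmetic. The following Python snippet is a minimal sketch of that bookkeeping, not part of the study's pipeline: the variable names are illustrative assumptions, while the counts and member names are those reported in the text.

```python
# Illustrative bookkeeping only; names of variables are assumptions,
# counts and member names are taken from the text.
MOUSE_TOTAL = 114

# bHLH family -> mouse member reported as not found in the giant panda genome
missing_in_panda = {
    "Beta3": "Beta3a",
    "Mesp": "Mesp2",
    "Paraxis": "Sclerax",
    "Myc": "S-Myc",
    "Hes": "Hes5 (or Hes6)",
    "COE": "EBF4",
    "Orphan": "Orphan 1",
}

# Of those seven members, the ones also reported as absent in other species
also_absent = {
    "zebrafish": {"S-Myc"},
    "chicken": {"S-Myc", "EBF4"},
}

# One member fewer in each of the seven families
panda_total = MOUSE_TOTAL - len(missing_in_panda)

# Members missing in giant panda but present in each comparison species
present_elsewhere = {
    species: sorted(set(missing_in_panda.values()) - absent)
    for species, absent in also_absent.items()
}

print(panda_total)
for species, members in present_elsewhere.items():
    print(species, len(members), members)
```

Most of the seven members missing in giant panda are present in zebrafish and chicken, which is the observation behind the suggestion that the gaps may reflect the current genome assembly rather than true gene loss.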
  22 in total

Review 1.  [Progress of studies on family members and functions of animal bHLH transcription factors].

Authors:  Yong Wang; Qin Yao; Ke-Ping Chen
Journal:  Yi Chuan       Date:  2010-04

2.  The rapid generation of mutation data matrices from protein sequences.

Authors:  D T Jones; W R Taylor; J M Thornton
Journal:  Comput Appl Biosci       Date:  1992-06

3.  Genome-wide analysis of basic/helix-loop-helix transcription factor family in rice and Arabidopsis.

Authors:  Xiaoxing Li; Xuepeng Duan; Haixiong Jiang; Yujin Sun; Yuanping Tang; Zheng Yuan; Jingkang Guo; Wanqi Liang; Liang Chen; Jingyuan Yin; Hong Ma; Jian Wang; Dabing Zhang
Journal:  Plant Physiol       Date:  2006-08       Impact factor: 8.340

4.  Recognition by Max of its cognate DNA through a dimeric b/HLH/Z domain.

Authors:  A R Ferré-D'Amaré; G C Prendergast; E B Ziff; S K Burley
Journal:  Nature       Date:  1993-05-06       Impact factor: 49.962

5.  Interferon-gamma of the giant panda (Ailuropoda melanoleuca): complementary DNA cloning, expression, and phylogenetic analysis.

Authors:  Yaqiong Tao; Bo Zeng; Liu Xu; Bisong Yue; Dong Yang; Fangdong Zou
Journal:  DNA Cell Biol       Date:  2010-01       Impact factor: 3.311

6.  Phylogenetic analysis of zebrafish basic helix-loop-helix transcription factors.

Authors:  Yong Wang; Keping Chen; Qin Yao; Xiaodong Zheng; Zhe Yang
Journal:  J Mol Evol       Date:  2009-05-16       Impact factor: 2.395

7.  The basic helix-loop-helix transcription factor family in Bombyx mori.

Authors:  Yong Wang; Keping Chen; Qin Yao; Wenbing Wang; Zhu Zhi
Journal:  Dev Genes Evol       Date:  2007-10       Impact factor: 0.900

8.  A genome-wide survey on basic helix-loop-helix transcription factors in rat and mouse.

Authors:  Xiaodong Zheng; Yong Wang; Qin Yao; Zhe Yang; Keping Chen
Journal:  Mamm Genome       Date:  2009-03-21       Impact factor: 2.957

9.  The basic helix-loop-helix transcription factor family in the honey bee, Apis mellifera.

Authors:  Yong Wang; Keping Chen; Qin Yao; Wenbing Wang; Zhi Zhu
Journal:  J Insect Sci       Date:  2008       Impact factor: 1.857

10.  A compendium of Caenorhabditis elegans regulatory transcription factors: a resource for mapping transcription regulatory networks.

Authors:  John S Reece-Hoyes; Bart Deplancke; Jane Shingles; Christian A Grove; Ian A Hope; Albertha J M Walhout
Journal:  Genome Biol       Date:  2005-12-30       Impact factor: 13.583

