Abstract
Members of the basic helix-loop-helix (bHLH) family of transcription factors play important roles in a wide range of developmental processes. To characterize the chicken bHLH transcription factor family, we conducted a genome-wide survey of the chicken (Gallus gallus) genomic database and identified 104 bHLH sequences belonging to 42 gene families. Phylogenetic analyses revealed that chicken has 50, 21, 15, 4, 8, and 3 bHLH members in groups A, B, C, D, E, and F, respectively, while three members belonging to none of these groups were classified as "orphans". A comparison between the chicken and human bHLH repertoires suggested that both organisms have a number of lineage-specific bHLH members in their proteomes. Chromosome distribution patterns and phylogenetic analyses strongly suggest that the bHLH members arose through gene duplication at an early date. Gene Ontology (GO) enrichment statistics showed the top 51 GO annotations of biological processes ranked by frequency. The present study deepens our understanding of the chicken bHLH transcription factor family and provides useful information for further studies using chicken as a model system.
Year: 2010 PMID: 20454632 PMCID: PMC2862960 DOI: 10.1155/2010/682095
Source DB: PubMed Journal: Comp Funct Genomics ISSN: 1531-6912
A complete list of 104 bHLH genes from chicken (Gallus gallus) with the corresponding human homologue information.
| Group | Family | Gallus gallus | Protein ID (GenBank Accession number) | Homo sapiens | BI posterior probability (%)a | ML Bootstrap value (%)b | Genome contig link |
|---|---|---|---|---|---|---|---|
| A | ASCa | | NP_989743.1 | | 83 | 71 | NW_001471698.1 |
| A | ASCa | | NP_990280.1 | | 93 | n/m* | NW_001471698.1 |
| A | ASCb | | XP_001232099.1 (ASCL3 transcript variant 1); XP_420985.2 (ASCL3 transcript variant 2) | | 100 | 89 | NW_001471698.1 |
| A | ASCb | | XP_425485.1 | | 51 | 89 | NW_001471513.1 |
| A | MyoD | | NP_989545.1 | | 88 | 95 | NW_001471698.1 |
| A | MyoD | | NP_989515.1 | | 100 | 94 | NW_001471608.1 |
| A | MyoD | | NP_001025534.1 | | 75 | 96 | NW_001471512.1 |
| A | MyoD | | NP_001025917.1 | | 93 | 99 | NW_001471512.1 |
| A | E12/E47 | | NP_990706.2 | | 54 | 78 | NW_001471425.1 |
| A | E12/E47 | | hmm39106 | | 54 | 78 | NW_001471425.1 |
| A | E12/E47 | | hmm9164 | | 96 | 98 | NW_001471627.1 |
| A | E12/E47 | | NP_989817.2 | | 98 | 97 | NW_001471627.1 |
| A | E12/E47 | | Q90683.1 | | 55 | n/m* | NW_001488824.1 |
| A | Ngn | | NP_990127.1 | | 99 | 94 | NW_001471685.1 |
| A | Ngn | | NP_990214.1 | | 100 | 90 | NW_001471449.1 |
| A | NeuroD | | NP_990251.1 | | 55 | n/m* | NW_001471729.1 |
| A | NeuroD | | XP_418852.1 | | 97 | 89 | NW_001471633.1 |
| A | NeuroD | | NP_990407.1 | | 99 | 94 | NW_001471747.1 |
| A | Atonal | | hmm54472 | | 100 | 87 | NW_001471683.1 |
| A | Atonal | | XR_026796.1 | | 100 | 87 | NW_001471683.1 |
| A | Atonal | | NP_989999.1 | | 99 | 91 | NW_001471715.1 |
| A | Mist | | XP_425228.1 | | 100 | 98 | NW_001471454.1 |
| A | Beta3 | | NP_989835.1 | | 57 | 62 | NW_001471567.1 |
| A | Beta3 | | NP_989834.1 | | 95 | 76 | NW_001471646.1 |
| A | Oligo | | NP_001026697.1 | | 67 | 62 | NW_001471669.1 |
| A | Oligo | | XP_001232806.1 | | 84 | 76 | NW_001471669.1 |
| A | Net | | XP_001234980.1 | | 96 | 98 | NW_001471687.1 |
| A | Mesp | | hmm11657 | | n/m | n/m | NW_001471429.1 |
| A | Mesp | | NP_989897.1 | | n/m | n/m | NW_001471429.1 |
| A | Mesp | | hmm17962 | | n/m | n/m | NW_001471429.1 |
| A | Mesp | | XP_001231219.1 | | n/m | n/m | NW_001471429.1 |
| A | Mesp | | NP_990015.1 | | n/m | n/m | NW_001471673.1 |
| A | Twist | | NP_990070.1 | | 96 | 82 | NW_001471633.1 |
| A | Twist | | NP_990010.1 | | 98 | 92 | NW_001471728.1 |
| A | Twist | | NP_001096684.1 | | 100 | 98 | NW_001471747.1 |
| A | Twist | | XP_424492.1 | | 100 | 98 | NW_001471747.1 |
| A | Paraxis | | NP_990277.1 | | 79 | 74 | NW_001471567.1 |
| A | Paraxis | | NP_989584.1 | | 95 | 92 | NW_001471733.1 |
| A | Paraxis | | XP_001234790.1 | | 91 | 97 | NW_001471733.1 |
| A | MyoRa | | XP_418293.2 | | 80 | 79 | NW_001471650.1 |
| A | MyoRa | | XP_419734.1 | | 100 | n/m* | NW_001471669.1 |
| A | MyoRb | | XP_427081.2 | | 85 | n/m* | NW_001471649.1 |
| A | Hand | | NP_990296.1 | | 99 | 91 | NW_001471449.1 |
| A | Hand | | NP_990297.1 | | 100 | 98 | NW_001471685.1 |
| A | PTFa | | XP_425989.1 | | 100 | 98 | NW_001471633.1 |
| A | PTFb | | XP_001234487.1 | | 99 | 95 | NW_001471728.1 |
| A | SCL | | NP_990683.1 | | 60 | 62 | NW_001471740.1 |
| A | SCL | | XP_424886.1 | | 99 | 82 | NW_001488876.1 |
| A | NSCL | | NP_989452.1 | | 100 | 99 | NW_001471598.1 |
| A | NSCL | | NP_990128.1 | | 72 | 85 | NW_001471526.1 |
| B | SRC | | NP_001012900.1 | | 91 | 98 | NW_001471673.1 |
| B | SRC | | XP_001231617.1 | | 100 | 98 | NW_001471649.1 |
| B | SRC | | XP_417385.2 | | 99 | 86 | NW_001471567 |
| B | MYC | | NP_001026262.1 | | 100 | 89 | NW_001471673.1 |
| B | MYC | | NP_001026123.1 | | 100 | 56 | NW_001471654.1 |
| B | MYC | | XP_425790.1 | | 98 | 98 | NW_001471589.1 |
| B | Mad | | NP_001034399.1 | | 98 | 96 | NW_001471581.1 |
| B | Mad | | NP_001012929.1 | | 98 | 74 | NW_001471720.1 |
| B | Mad | | NP_001006460.1 | | 100 | 85 | NW_001471687.1 |
| B | Mnt | | XP_425414.2 | | 98 | 68 | NW_001471508.1 |
| B | MAX | | P52162.1 | | 100 | 91 | NW_001471508.1 |
| B | USF | | NP_001007486.1 | | 92 | 82 | NW_001474499.1 |
| B | MITF | | NP_990360.1 | | 100 | 64 | NW_001471443.1 |
| B | MITF | | NP_001026093.1 | | 100 | 96 | NW_001471610.1 |
| B | MITF | | NP_001006229.1 | | 100 | 71 | NW_001471512.1 |
| B | SREBP1 | | NP_989457.1 | | 100 | 96 | NW_001471454.1 |
| B | SREBP2 | | XP_416222.2 | | 100 | 99 | NW_001471513.1 |
| B | Mlx | | NP_001104311.1 | | 96 | n/m* | NW_001471508.1 |
| B | Mlx | | hmm20496 | | 96 | n/m* | NW_001471508.1 |
| B | Mlx | | hmm54830 | | 100 | 91 | NW_001471459.1 |
| B | TF4 | | NP_001026101.1 | | 100 | 83 | NW_001471622.1 |
| C | Clock | | NP_989505.2 | | 98 | 87 | NW_001471686.1 |
| C | Clock | | NP_001025713.1 | | 100 | 97 | NW_001471545.1 |
| C | Clock | | XP_420353.2 | | 100 | 99 | NW_001471681.1 |
| C | ARNT | | NP_989531.1 | | 100 | 100 | NW_001471606.1 |
| C | ARNT | | XP_413854.2 | | 100 | 100 | NW_001471428.1 |
| C | Bmal | | NP_001001463.1 | | 71 | 85 | NW_001471698.1 |
| C | Bmal | | NP_989464.1 | | 100 | n/m* | NW_001471513.1 |
| C | AHR | | hmm34307 | | 68 | 94 | NW_001471728.1 |
| C | AHR | | hmm34113 | | 68 | 94 | NW_001471728.1 |
| C | AHR | | hmm46108 | | 70 | 90 | NW_001471639.1 |
| C | Sim | | XP_419817.2 | | 74 | n/m* | NW_001471671.1 |
| C | Sim | | XP_416724.2 | | 93 | 88 | NW_001471534.1 |
| C | Trh | | XP_421232.2 | | 73 | n/m* | NW_001471710.1 |
| C | HIF | | NP_989628.1 | | 100 | 92 | NW_001471710.1 |
| C | HIF | | NP_990138.1 | | 100 | 91 | NW_001471679.1 |
| D | Emc | | NP_989921.1 | | 69 | n/m* | NW_001471567.1 |
| D | Emc | | NP_990333.1 | | 98 | 89 | NW_001471673.1 |
| D | Emc | | NP_989920.1 | | 100 | 96 | No clear |
| D | Emc | | NP_989613.1 | | 91 | 86 | NW_001471637.1 |
| E | Hey | | XP_425926.2 | | 97 | 89 | NW_001471651.1 |
| E | Hey | | XP_419754.2 | | 66 | 73 | NW_001471671.1 |
| E | H/E(spl) | | hmm32419 | | 82 | 80 | NW_001471443.1 |
| E | H/E(spl) | | XP_422641.2 | | n/m | n/m | NW_001471743.1 |
| E | H/E(spl) | | XP_416543.2 | | n/m | n/m | NW_001471526.1 |
| E | H/E(spl) | | NP_001012713.1 | | 75 | 78 | NW_001471571.1 |
| E | H/E(spl) | | XP_417552.2 | | n/m | 97 | NW_001471571.1 |
| E | H/E(spl) | | XP_417553.2 | | n/m | 97 | NW_001471571.1 |
| F | Coe | | NP_990083.1 | | 52 | n/m* | NW_001471449.1 |
| F | Coe | | XP_417675.2 | | 94 | 90 | NW_001471575.1 |
| F | Coe | | XP_421824.2 | | 67 | n/m* | NW_001471723.1 |
| ? | Orphan | | XP_422318.1 | | n/m | n/m | NW_001471740.1 |
| ? | Orphan | | XP_001234727.1 | | 100 | 93 | NW_001471567.1 |
| ? | Orphan | | XP_001235101.1 | | n/m | n/m | NW_001471508.1 |
Chicken bHLH genes were named according to their human homologues. Support values are from phylogenetic analyses with human bHLH sequences using Bayesian inference (BI) and maximum likelihood (ML) algorithms, respectively. BI posterior probability (note a) refers to the result from Bayesian inference, and ML bootstrap value (note b) refers to the result from maximum likelihood estimation. The numbers in the phylogenetic trees were converted into percentages. All bHLH members are listed in the order of bHLH families given in Ledent et al. [5, Table 1]. All protein sequences were retrieved from the NCBI website except those whose identifiers begin with "hmm", which were from the "Ab initio protein" database. A question mark means no match. The mark n/m means that the sequence did not form a monophyletic group with a single bHLH sequence of a known family but did form a monophyletic group with two or more homologous sequences of the same family; n/m* denotes cases in which the bootstrap value was estimated at less than 50%.
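As a rough illustration of how the support-value notation in this table could be handled when the table is processed programmatically, the following Python sketch maps one cell to a percentage plus a status label. The function name `parse_support` and the three status labels are hypothetical and are not part of the original study.

```python
from typing import Optional, Tuple


def parse_support(cell: str) -> Tuple[Optional[int], str]:
    """Interpret one BI or ML support cell from the table.

    Returns (percentage, status). The status labels are illustrative
    names chosen for this sketch, not terms used in the paper.
    """
    text = cell.strip()
    if text == "n/m*":
        # Support value estimated at less than 50%.
        return None, "below_50"
    if text == "n/m":
        # No monophyletic group with a single sequence of a known family,
        # but grouped with two or more homologues of the same family.
        return None, "family_level_only"
    # Otherwise the cell holds a support value already given as a percentage.
    return int(text), "supported"
```

For example, `parse_support("83")` gives `(83, "supported")`, while `parse_support("n/m*")` gives `(None, "below_50")`.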
Figure 1. Alignment of the 104 chicken bHLH protein domains, shaded using Genedoc. Designation of basic, helix 1, loop, and helix 2 regions follows [1] and Ferre-D et al. [14]. Detailed information on the 104 chicken bHLH proteins is given in Table 1.
Figure 2. Chromosomal locations of chicken bHLH transcription factor genes. Chicken bHLH names in red mark members of the same family that cluster together. Family information for each bHLH gene is listed in Table 1.
A comparison of the number of bHLH factors among vertebrate and invertebrate species.
| Family | Group | Drosophila | Lancelet | Giant owl limpet | Chicken | Zebrafish | Rat | Mouse |
|---|---|---|---|---|---|---|---|---|
| ASCa | A | 4 | 3 | 6 | 2 | 2 | 2 | 2 |
| ASCb | A | 0 | 1 | 1 | 2 | 3 | 3 | 3 |
| MyoD | A | 1 | 4 | 1 | 4 | 4 | 4 | 4 |
| E12/E47 | A | 1 | 1 | 4 | 5 | 5 | 4 | 4 |
| Ngn | A | 1 | 1 | 3 | 2 | 2 | 3 | 3 |
| NeuroD | A | 0 | 1 | 1 | 3 | 5 | 4 | 4 |
| Atonal | A | 3 | 1 | 2 | 3 | 4 | 2 | 2 |
| Mist | A | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Beta3 | A | 1 | 1 | 2 | 2 | 3 | 2 | 2 |
| Oligo | A | 0 | 2 | 3 | 2 | 4 | 3 | 3 |
| Net | A | 1 | 1 | 2 | 1 | 1 | 1 | 1 |
| Delilah | A | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| Mesp | A | 1 | 1 | 0 | 5 | 5 | 3 | 3 |
| Twist | A | 1 | 1 | 2 | 4 | 3 | 2 | 2 |
| Paraxis | A | 1 | 2 | 1 | 3 | 4 | 2 | 2 |
| MyoRa | A | 1 | 4 | 1 | 2 | 2 | 2 | 2 |
| MyoRb | A | 0 | 1 | 1 | 1 | 2 | 2 | 2 |
| Hand | A | 1 | 1 | 1 | 2 | 1 | 2 | 2 |
| PTFa | A | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| PTFb | A | 2 | 3 | 1 | 1 | 2 | 1 | 1 |
| SCL | A | 1 | 1 | 5 | 2 | 3 | 3 | 3 |
| NSCL | A | 1 | 1 | 1 | 2 | 1 | 2 | 2 |
| SRC | B | 1 | 1 | 0 | 3 | 3 | 3 | 3 |
| Fig | B | 0 | 1 | 0 | 0 | 1 | 1 | 1 |
| Myc | B | 1 | 1 | 1 | 3 | 6 | 4 | 4 |
| Mad | B | 0 | 1 | 1 | 3 | 4 | 4 | 4 |
| Mnt | B | 1 | 1 | 1 | 1 | 2 | 1 | 1 |
| Max | B | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| USF | B | 1 | 1 | 2 | 1 | 2 | 2 | 2 |
| MITF | B | 1 | 1 | 1 | 3 | 5 | 4 | 4 |
| SREBP | B | 1 | 1 | 1 | 2 | 2 | 2 | 2 |
| AP4 | B | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
| MLX | B | 1 | 1 | 7 | 3 | 1 | 2 | 2 |
| TF4 | B | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| Clock | C | 3 | 1 | 2 | 3 | 3 | 2 | 2 |
| ARNT | C | 1 | 1 | 0 | 2 | 2 | 2 | 2 |
| Bmal | C | 1 | 1 | 0 | 2 | 2 | 2 | 2 |
| AHR | C | 2 | 1 | 1 | 3 | 4 | 2 | 2 |
| Sim | C | 1 | 1 | 1 | 2 | 2 | 2 | 2 |
| Trh | C | 1 | 1 | 0 | 1 | 2 | 1 | 1 |
| HIF | C | 1 | 1 | 1 | 2 | 6 | 4 | 4 |
| Emc | D | 1 | 1 | 2 | 4 | 5 | 4 | 4 |
| Hey | E | 1 | 1 | 1 | 2 | 4 | 4 | 4 |
| H/E(spl) | E | 11 | 11 | 12 | 6 | 15 | 8 | 8 |
| Coe | F | 1 | 1 | 1 | 3 | 5 | 4 | 4 |
| Orphan | ? | 0 | 6 | 4 | 3 | 2 | 4 | 4 |
| Total | | 59 | 78 | 82 | 104 | 139 | 114 | 114 |
The vertebrate and invertebrate species compared are lancelet (Branchiostoma floridae), giant owl limpet (Lottia gigantea), Drosophila (Drosophila melanogaster, fruit fly), zebrafish (Danio rerio), chicken (Gallus gallus), rat (Rattus norvegicus), and mouse (Mus musculus). Data on lancelet and Drosophila are from Simionato et al. [9]. Data on zebrafish, rat, and mouse are from Wang et al. [10] and Zheng et al. [11]. Data on giant owl limpet and chicken are from the findings of this study. Family names and group assignments follow Ledent et al. [5, Table 1].
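The per-family counts in this table can be cross-checked against the group totals stated in the abstract (50, 21, 15, 4, 8, and 3 members in groups A to F, plus three orphans, for 104 in total). The sketch below does this in Python for the chicken column, taken here to be the column whose total is 104. The variable and function names are hypothetical, and the counts are transcribed from the table rows above.

```python
from collections import defaultdict

# (family, group, chicken count), transcribed from the table column
# whose total is 104.
CHICKEN_COUNTS = [
    ("ASCa", "A", 2), ("ASCb", "A", 2), ("MyoD", "A", 4), ("E12/E47", "A", 5),
    ("Ngn", "A", 2), ("NeuroD", "A", 3), ("Atonal", "A", 3), ("Mist", "A", 1),
    ("Beta3", "A", 2), ("Oligo", "A", 2), ("Net", "A", 1), ("Delilah", "A", 0),
    ("Mesp", "A", 5), ("Twist", "A", 4), ("Paraxis", "A", 3), ("MyoRa", "A", 2),
    ("MyoRb", "A", 1), ("Hand", "A", 2), ("PTFa", "A", 1), ("PTFb", "A", 1),
    ("SCL", "A", 2), ("NSCL", "A", 2),
    ("SRC", "B", 3), ("Fig", "B", 0), ("Myc", "B", 3), ("Mad", "B", 3),
    ("Mnt", "B", 1), ("Max", "B", 1), ("USF", "B", 1), ("MITF", "B", 3),
    ("SREBP", "B", 2), ("AP4", "B", 0), ("MLX", "B", 3), ("TF4", "B", 1),
    ("Clock", "C", 3), ("ARNT", "C", 2), ("Bmal", "C", 2), ("AHR", "C", 3),
    ("Sim", "C", 2), ("Trh", "C", 1), ("HIF", "C", 2),
    ("Emc", "D", 4),
    ("Hey", "E", 2), ("H/E(spl)", "E", 6),
    ("Coe", "F", 3),
    ("Orphan", "?", 3),
]


def totals_by_group(rows):
    """Sum member counts per bHLH group (A-F, '?' for orphans)."""
    totals = defaultdict(int)
    for _family, group, count in rows:
        totals[group] += count
    return dict(totals)


group_totals = totals_by_group(CHICKEN_COUNTS)
grand_total = sum(group_totals.values())

# Families with at least one chicken member, leaving out the orphan row.
families_present = sum(
    1 for family, _group, count in CHICKEN_COUNTS
    if count > 0 and family != "Orphan"
)

print(group_totals)
print(grand_total, families_present)
```

Under this transcription the group totals and the grand total agree with the figures in the abstract, and the number of families with at least one chicken member comes to 42.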
Figure 3. Phylogenetic tree of Hes (hairy and enhancer of split) homologues from human, mouse, rat, zebrafish, and chicken. The Bayesian inference tree is shown. The zebrafish Heyl (hey-like) sequence was used as the outgroup. Numbers at the nodes are Bayesian posterior probabilities of the corresponding branches, converted into percentages. The phylogenetic tree of Hes factor motifs revealed that Hes1, Hes2, Hes3, Hes5, Hes6, and Hes7 each had their own common ancestor sequence.
Figure 4. Frequency counts of the top 51 GO terms for chicken biological processes. The bar plot indicates the numbers (frequencies) of Gene Ontology (GO) terms collected for a set of 89 biological process categories on the chicken bHLH proteins [15]. The top 51 GO annotations, each counted at least five times, are shown. Ambiguous GO terms for biological process subtypes, such as regulation of transcription, regulation of biological process, and regulation of cellular process, were excluded.