Identification and Phylogenetic Analysis of Basic Helix-Loop-Helix Genes in the Diamondback Moth.

Zhen Zeng, Yong Wang, Nana Adwoa Nkuma Johnson, Guang-Dong Wang, Qin Yao, Ke-Ping Chen.

Abstract

Basic helix-loop-helix (bHLH) transcription factors play essential roles in regulating eukaryotic developmental and physiological processes such as neuron generation, myocyte formation, intestinal tissue development, and response to environmental stress. In this study, the genome of the diamondback moth, Plutella xylostella (L.) (Lepidoptera: Plutellidae), was found to encode 52 bHLH genes. All 52 P. xylostella bHLH (PxbHLH) genes were classified into the corresponding bHLH families according to their orthology with bHLHs from fruit fly and other insect species. Among these 52 PxbHLH genes, 19 are annotated in the GenBank database consistently with our classification. The remaining 33 PxbHLH genes are annotated either as general bHLH genes or as hypothetical genes. Therefore, our data provide useful information for updating the annotations of PxbHLH genes. P. xylostella has four stem cell leukemia (SCL) genes (one of which has three copies), two Dys genes, two copies each of the MyoR, Mitf, and Sima genes, and three copies of the Sage gene. Further studies may be conducted to elucidate the functions of these specific bHLH genes in regulating P. xylostella growth and development.

Year:  2018        PMID: 29901738      PMCID: PMC6007555          DOI: 10.1093/jisesa/iey057

Source DB:  PubMed          Journal:  J Insect Sci        ISSN: 1536-2442            Impact factor:   1.857


A basic helix-loop-helix (bHLH) motif is approximately 60 amino acids in length. It is composed of a basic (alkaline) region capable of binding DNA and a helix-loop-helix region capable of forming a dimer with another HLH motif. Based on statistical analysis of the amino acid composition of a large number of bHLH motifs, Atchley et al. (1999) discovered 19 highly conserved sites in the bHLH motif at which specific amino acids are present. For example, either arginine or lysine is present at the first, second, and 10th site of the basic region. Therefore, a criterion was established to qualify a candidate bHLH protein sequence by examining whether specific amino acids are present at the 19 conserved sites. According to this criterion, a qualified bHLH protein sequence should have at least 11 specific amino acids present at the 19 conserved sites.

bHLH proteins constitute a large superfamily of transcription factors. Various bHLH proteins play significant regulatory roles in a wide range of eukaryotic developmental and physiological processes such as neuron generation, myocyte formation, intestinal tissue development, and response to environmental stress (Massari and Murre 2000). The number of bHLH genes varies greatly among eukaryotic species. For example, yeast, nematode, fruit fly, mouse, and zebrafish genomes were found to encode 8, 45, 59, 114, and 139 bHLH genes, respectively, while the genomes of thale cress and rice were found to encode 147 and 167 bHLH genes, respectively (Robinson and Lopes 2000, Ledent et al. 2002, Bailey and Weisshaar 2003, Li et al. 2006, Simionato et al. 2007, Wang et al. 2009, Zheng et al. 2009).

Animal bHLH proteins are currently classified into groups A, B, C, D, E, and F according to the nucleotide composition of the target DNA elements they recognize and the common structural features they possess. Group A and B bHLH proteins recognize and bind DNA elements containing the E box CANNTG (N means any nucleotide), which is CA(G/C)CTG for group A and CA(CG/TGT)TG for group B. Group C bHLH proteins recognize and bind DNA elements containing (A/G)CGTG. Most group C proteins also contain a Per-Arnt-Sim (PAS) domain that facilitates dimerization with another PAS-containing protein (Jones 2004). Group D bHLH proteins have no basic region. They do not recognize any target DNA elements but form inactive heterodimers with group A bHLH proteins. Group E bHLH proteins recognize and bind DNA elements containing the N box CACG(C/A)G. Their bHLH motifs are closely followed by a structural domain named ORANGE. In addition, a WRPW (tryptophan-arginine-proline-tryptophan) motif is present at their carboxyl terminus. Group F bHLH proteins do not have a basic region. Instead, they have an IPT (immunoglobulin-like, plexins and transcription factor) structural domain that facilitates dimerization and target DNA binding (Ledent and Vervoort 2001).

Animal bHLH proteins are also divided into 45 families according to their specific functions in regulating eukaryotic growth and development (Simionato et al. 2007). Insect bHLH genes are distributed into 42 families, and each family has one or two members (Liu et al. 2015). Thus, an insect species generally has around 55 bHLH genes in its genome. Although the total number of bHLH genes is similar among insects, the number of bHLH genes in each family can be quite different. For example, Asian citrus psyllid has two to three bHLH genes in the Net, Hand (heart and neural crest derivatives), and SRC (steroid receptor coactivator) families (Peng et al. 2017), while other insect species have only one gene in each of these families. Mosquitoes have three Ato genes in the Atonal family (Zhang et al. 2013), while other insect species have only one Ato gene. Jewel wasp was found to lack the Net, MyoR (myogenic repressor), and Fer1 (forty-eight related 1) genes (Liu et al. 2015), which are all present in other insect species. The presence or absence of specific bHLH genes may lead to physiological and behavioral differences among insect species, because each bHLH gene has its specific role(s) in controlling the expression of genes related to organismal development. For example, Atonal family genes are involved in the developmental regulation of Drosophila chordotonal organs and photoreceptors (Jarman et al. 1995). Hand and SRC family genes play roles in controlling Drosophila heart morphogenesis and larval metamorphosis (Han et al. 2006, Jang et al. 2009). Net and MyoR family genes regulate Drosophila intervein and muscle development, respectively (Georgias et al. 1997, Brentrup et al. 2000).

The diamondback moth, Plutella xylostella (L.) (Lepidoptera: Plutellidae), is one of the most aggressive pests of brassica vegetables and oilseed crops (Zalucki et al. 2012, Furlong et al. 2013). The name diamondback moth refers to the few light-colored diamond shapes present on the posterior margins of its forewings (Adashkevich 1972). P. xylostella larvae feed on the leaves of host plants from the seedling stage onward, which greatly affects the yield and quality of the crop (Furlong et al. 2013). Diamondback moths have very few natural enemies and strong resistance to various insecticides, including insecticidal toxins. Therefore, they are very hard to control efficiently in the field (Talekar and Shelton 1993). The annual worldwide cost of pest management against the diamondback moth has reached more than US$1 billion (Zalucki et al. 2012, Tian et al. 2013). Its resistance to over 79 insecticides and the failure to establish additional control measures have made it impossible to grow cruciferous crops in certain areas (Liang et al. 2001, Sun et al. 2012).

In view of the importance of bHLH transcription factors in regulating insect tissue and organ development, knowledge of the bHLH gene composition of P. xylostella would facilitate further studies on the functions of specific bHLH proteins in regulating P. xylostella development and may aid in establishing biological strategies to control its occurrence. Therefore, in the present study, we employed Blast searches and phylogenetic analyses to identify bHLH genes encoded in the genome of the diamondback moth. A comparison with other insects showed that P. xylostella has additional bHLH genes and/or gene copies in six bHLH families.
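
The target elements described above for groups A, B, C, and E can be written as regular expressions. The following is only an illustrative sketch: the function name `matching_elements` is hypothetical, and CA(CG/TGT)TG is read here as "CACGTG or CATGTTG", which is an assumption about the notation. Groups D and F are omitted because the text assigns them no recognition element.

```python
import re

# Consensus elements quoted in the text (N = any nucleotide).
# Assumption: "CA(CG/TGT)TG" means CACGTG or CATGTTG.
ELEMENTS = {
    "E box": r"CA[ACGT]{2}TG",        # CANNTG
    "group A": r"CA[GC]CTG",          # CA(G/C)CTG
    "group B": r"CA(?:CG|TGT)TG",     # CA(CG/TGT)TG
    "group C": r"[AG]CGTG",           # (A/G)CGTG
    "group E (N box)": r"CACG[CA]G",  # CACG(C/A)G
}


def matching_elements(dna):
    """Return the names of all elements found anywhere in a DNA string."""
    seq = dna.upper()
    return [name for name, pattern in ELEMENTS.items() if re.search(pattern, seq)]


if __name__ == "__main__":
    for site in ("ttCAGCTGaa", "CACGTG", "CACGCG"):
        print(site, matching_elements(site))
```

Note that the patterns overlap: CACGTG satisfies the group B element and also contains the group C core ACGTG, so a sequence match alone does not assign a binding site to a single group.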

Materials and Methods

Data Collection

The amino acid sequences of 45 representative bHLH motifs were taken from a previous report (Ledent and Vervoort 2001). They were used as query sequences in Blastp searches to retrieve candidate bHLH protein sequences of the diamondback moth at https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&BLAST_SPEC=OGP__51655__68127&LINK_LOC=blasttab. ‘Annotated proteins’ was selected as the target database and all other parameters were left at their default settings. The searches returned a large number of P. xylostella protein sequences, which were then manually examined to remove redundant ones. The amino acid sequence of each P. xylostella bHLH (PxbHLH) motif was used in a tBlastn search against the P. xylostella genome at https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=tblastn&PAGE_TYPE=BlastSearch&BLAST_SPEC=OGP__51655__68127&LINK_LOC=blasttab to obtain the contig number, reading frame, and coding region(s) of each PxbHLH motif. ‘RefSeq Genomic’ was selected as the target database, low complexity regions were not filtered, and all other parameters were left at their default values.

Multiple Sequence Alignment

From the above Blastp and tBlastn searches, amino acid sequences of candidate P. xylostella bHLH motifs were obtained. Each motif was manually examined to confirm whether it has sufficient conserved amino acids. If 11 or more conserved amino acids were found at the 19 conserved sites indicated by Atchley et al. (1999), the bHLH motif was considered a potential bHLH family member. Because bHLH motifs of groups D and F have no basic region, and typical group D and F bHLH motifs have only 33 and 45 amino acids, the minimum number of conserved amino acids required to qualify a group D or F bHLH protein was reduced to 5 and 8, respectively. Amino acid sequences of all eligible bHLH motifs were aligned with the MUSCLE program (Edgar 2004) embedded in MEGA 5.2 (Tamura et al. 2011) using default settings. The aligned P. xylostella bHLH (PxbHLH) motifs were saved in FASTA format and subsequently exported to GeneDoc (Edgar 2004) to display the degree of amino acid conservation. The multiple sequence alignment was copied and saved as a rich text file for further annotation.
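
The qualification rule above can be sketched as follows. This is a minimal illustration, not the procedure actually used (the motifs were examined manually). The site table `CONSERVED_SITES` is a hypothetical placeholder holding only the three basic-region positions named in the Introduction (arginine or lysine at sites 1, 2, and 10); the full set of 19 sites from Atchley et al. (1999) is not reproduced here. The function names are likewise illustrative.

```python
# Hypothetical, incomplete site table: position in the aligned motif
# (1-based) -> allowed residues. Only the three sites named in the text.
CONSERVED_SITES = {1: "RK", 2: "RK", 10: "RK"}

# Minimum number of conserved residues per group, as stated in the text:
# 5 for group D, 8 for group F, 11 for all other groups.
MIN_CONSERVED = {"D": 5, "F": 8}
DEFAULT_MIN = 11


def count_conserved(aligned_motif, sites=CONSERVED_SITES):
    """Count conserved sites carrying an allowed residue (gaps never match)."""
    hits = 0
    for position, allowed in sites.items():
        if position <= len(aligned_motif):
            residue = aligned_motif[position - 1].upper()
            if residue != "-" and residue in allowed:
                hits += 1
    return hits


def qualifies(n_conserved, group):
    """Apply the group-specific threshold to a count of conserved residues."""
    return n_conserved >= MIN_CONSERVED.get(group, DEFAULT_MIN)


if __name__ == "__main__":
    toy_motif = "RK" + "A" * 7 + "K" + "A" * 5  # toy sequence, not a real motif
    print(count_conserved(toy_motif))            # counts sites 1, 2 and 10
    print(qualifies(11, "A"), qualifies(5, "D"), qualifies(7, "F"))
```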

Phylogenetic Analysis

The PxbHLH motifs that passed the above examination were subjected to phylogenetic analysis to determine their orthology with known bHLH family members. Our previous reports indicated that in-group phylogenetic analysis is efficient in determining whether two genes are orthologous (Wang et al. 2007, Wang et al. 2008). Briefly, the method has two steps. In step 1, all the obtained PxbHLH sequences were used to construct a maximum likelihood (ML) phylogenetic tree in MEGA 5.2 together with 59 DmbHLH (Drosophila melanogaster (Meigen) (Diptera: Drosophilidae) bHLH) motifs. This tree was used to determine the group to which each PxbHLH belongs. In step 2, a single PxbHLH motif was used to construct NJ (neighbor-joining), MP (maximum parsimony), and ML trees with the DmbHLH motifs of the group determined in step 1. For example, if step 1 showed that a PxbHLH motif was located in the clade formed by group A DmbHLH motifs, then step 2 used this PxbHLH motif to construct phylogenetic trees with group A DmbHLH motifs only. In step 2, if a PxbHLH motif formed a monophyletic clade with a specific DmbHLH motif and all bootstrap values supporting this clade were higher than 50, the PxbHLH was considered an ortholog of that DmbHLH sequence. If a PxbHLH motif did not form a monophyletic clade with any DmbHLH motif, or the clade was supported by bootstrap values below 50, known bHLH motifs from other insect species were used to determine its orthology. Detailed steps for conducting in-group phylogenetic analysis are available in Liu et al. (2015).
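
The decision rule of step 2 can be summarized in a few lines. This sketch only encodes the stated rule (monophyly plus NJ, MP, and ML bootstrap values all higher than 50); the function name `assign_orthology` and its return strings are illustrative, and tree construction itself is assumed to have been done in MEGA.

```python
BOOTSTRAP_CUTOFF = 50  # "higher than 50" in the text


def assign_orthology(monophyletic, nj, mp, ml, cutoff=BOOTSTRAP_CUTOFF):
    """Apply the step 2 rule to one PxbHLH motif and one candidate DmbHLH motif.

    monophyletic: True if the two motifs form a monophyletic clade.
    nj, mp, ml:   bootstrap values supporting that clade in the three trees.
    """
    if monophyletic and all(value > cutoff for value in (nj, mp, ml)):
        return "ortholog of the DmbHLH motif"
    return "test against bHLH motifs from other insect species"


if __name__ == "__main__":
    # Bootstrap values 56, 52 and 72 are those listed for PxDpn in Table 1.
    print(assign_orthology(True, 56, 52, 72))
    # Hypothetical case with one weakly supported tree.
    print(assign_orthology(True, 90, 48, 85))
```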

Protein Functional Domain Prediction

Full-length PxbHLH protein sequences were retrieved from GenBank (https://www.ncbi.nlm.nih.gov/) using the corresponding protein accession numbers. The protein sequences were then submitted to SMART (simple modular architecture research tool) (http://smart.embl-heidelberg.de/) for prediction of structural domains with default settings.

Results and Discussion

bHLH Family Members in P. xylostella

Through Blastp searches and manual examination, 52 PxbHLH (P. xylostella bHLH) family members were identified in P. xylostella protein databases (Fig. 1). Each identified PxbHLH motif contains more than 11 conserved amino acids, meaning that proteins containing these motifs are eligible bHLH proteins. Through in-group phylogenetic analyses, all identified PxbHLHs were classified into the corresponding bHLH families with bootstrap values higher than 50 (Table 1). Among them, 37 PxbHLH genes were classified and named according to their bHLH orthologs from fruit fly. The remaining 15 PxbHLH genes were classified and named according to their bHLH orthologs from domestic silkworm (Bombyx mori (L.) (Lepidoptera: Bombycidae)), jewel wasp (Nasonia vitripennis (Walker) (Hymenoptera: Pteromalidae)), or Asian citrus psyllid (Diaphorina citri (Kuwayama) (Hemiptera: Liviidae)). P. xylostella has two Ash2 genes, two Dys genes, and four SCL genes. These genes have been named PxAsh2a and PxAsh2b, PxDys1 and PxDys2, and PxSCL1 to PxSCL4. Based on our classification, P. xylostella has 25, 9, 11, 1, 5, and 1 bHLH genes in groups A to F, respectively.
Fig. 1.

Multiple sequence alignment of 52 P. xylostella bHLH motifs. Basic, helix 1, loop, and helix 2 regions are delineated according to Ferre-D’Amare et al. (1993). Numbers below the delineation represent sites of amino acid residues. The conserved sites are marked with asterisks. Hyphens denote gaps. bHLH family and group names have been organized in accordance with Table 1.

Table 1.

A complete list of Plutella xylostella bHLH genes

| bHLH family | Fruit fly gene | PxbHLH gene | Bootstrap NJ | Bootstrap MP | Bootstrap ML | Protein accession No. | Annotation in GenBank | Group |
|---|---|---|---|---|---|---|---|---|
| ASCa | Ase | PxAse^Bm | 75 | 85 | 96 | XP_011568337.1 | ASC protein T8 | A |
| | Ash2^Bm | PxAsh2a^Bm | 85 | 56 | 87 | XP_011568336.1 | ASC protein T3-like | A |
| | | PxAsh2b^Bm | 85 | 53 | 88 | XP_011552936.1 | ASC protein T3-like | A |
| | Ash3^Bm | PxAsh3^Bm | 75 | 79 | 86 | XP_011552937.1 | ASC protein T3-like | A |
| E12/E47 | da | PxDa | 99 | 100 | 99 | XP_011562492.1 | da | A |
| MyoD | nau | PxNau | 82 | 95 | 97 | XP_011562883.1 | MyoD1 | A |
| Ngn | tap | PxTap | 80 | 91 | 93 | XP_011550236.1 | TAP-like | A |
| Mist | dimm | PxDimm | 79 | 94 | 97 | XP_011566759.1 | Ngn2 | A |
| Beta3 | Oli | PxOli | 100 | 100 | 100 | XP_011552126.1 | class E bHLH protein | A |
| Atonal | ato | PxAto | 72 | 96 | 93 | XP_011563274.1 | Atonal-like | A |
| Net | net | PxNet | 82 | 95 | 94 | XP_011549879.1 | atonal8 | A |
| MyoRa | MyoR | PxMyoR | 97 | 76 | 88 | XP_011554754.1 | scleraxis-like | A |
| Delilah | tx | PxTx | 61 | 96 | 91 | XP_011566491.1 | HLH protein delilah | A |
| Mesp | sage | PxSage | 79 | 99 | 98 | XP_011555948.1 | fer3-like protein | A |
| Paraxis | Pxs | PxPxs | 85 | 77 | 89 | XP_011558845.1 | transcription factor 15 | A |
| Twist | twi | PxTwi | 90 | 88 | 93 | XP_011547896.1 | twist-related protein 2 | A |
| PTFa | Fer1 | PxFer1 | 92 | 88 | 90 | XP_011554850.1 | PTFa | A |
| PTFb | Fer2 | PxFer2 | 84 | 96 | 90 | XP_011561521.1 | Tal protein 1 | A |
| | Fer3 | PxFer3 | 99 | 99 | 99 | XP_011554575.1 | protein Fer3 | A |
| Hand | Hand | PxHand | 94 | 94 | 86 | XP_011557047.1 | Hypothetical protein | A |
| SCL | SCL | PxSCL1 | 98 | 96 | 99 | XP_011549084.1 | atonal 7-B-like | A |
| | | PxSCL2^Bm | 81 | 71 | 90 | XP_011568653.1 | Tal protein 1 | A |
| | | PxSCL3^Bm | 84 | 63 | 93 | XP_011551236.1 | Hypothetical protein | A |
| | | PxSCL4^Bm | 81 | 75 | 96 | XP_011568701.1 | Hypothetical protein | A |
| NSCL | NSCL | PxNSCL | 92 | 98 | 97 | XP_011557149.1 | HLH protein 2 | A |
| Mnt | Mnt | PxMnt | 61 | 53 | 78 | XP_011567464.1 | MNT-like | B |
| Max | max | PxMax | 87 | 97 | 91 | XP_011550502.1 | protein max | B |
| Mad | Mad^Nv | PxMad^Nv | 88 | 97 | 96 | XP_011551922.1 | Mad 1-like | B |
| Myc | dm | PxDm | 98 | 100 | 91 | XP_011554509.1 | Myc protein | B |
| USF | USF | PxUsf | 90 | 84 | 96 | XP_011549434.1 | USF2 | B |
| MITF | Mitf | PxMitf | 91 | 100 | 99 | XP_011566220.1 | Mitf-like | B |
| TF4 | bmx | PxBmx | 92 | 92 | 90 | XP_011552773.1 | max-like protein X | B |
| MLX | Mio | PxMio | 74 | 84 | 95 | XP_011548458.1 | MLX-interacting protein | B |
| SRC | tai | PxTai^Bm | 81 | 98 | 95 | XP_011554670.1 | Hypothetical protein | B |
| Clock | clk | PxClk | 94 | 99 | 98 | XP_011553145.1 | Clock | C |
| | Met | PxMet^Bm | 85 | 93 | 94 | XP_011557479.1 | ARNT2 | C |
| | JHR^Bm | PxJHR^Bm | 96 | 97 | 97 | 19354544.p | Hypothetical protein | C |
| AHR | Dys | PxDys1 | 99 | 99 | 99 | XP_011559031.1 | NPAS4 | C |
| | | PxDys2 | 99 | 99 | 99 | XP_011556553.1 | NPAS4 | C |
| | ss | PxSs | 80 | 100 | 97 | XP_011548350.1 | AHR | C |
| Sim | Sim | PxSim | 96 | 100 | 100 | XP_011561807.1 | Sim1-like | C |
| Trh | trh | PxTrh | 54 | 97 | 96 | XP_011556672.1 | protein Trh | C |
| HIF | sima | PxSima | 93 | 86 | 95 | XP_011553430.1 | HIF1α | C |
| ARNT | tgo | PxTgo | 75 | 100 | 100 | XP_011558891.1 | ARNT | C |
| Bmal | cyc | PxCyc^Dc | 65 | 54 | 60 | XP_011557457.1 | protein cycle | C |
| Emc | Emc | PxEmc | 80 | 94 | 91 | XP_011568487.1 | protein Emc | D |
| Hey | Hey | PxHey | 53 | 81 | 88 | XP_011560171.1 | Hey1 | E |
| | cwo | PxCwo | 87 | 94 | 96 | XP_011562380.1 | Hey | E |
| H/E(spl) | h | PxH^Bm | 92 | 91 | 98 | XP_011563228.1 | hairy-like | E |
| | dpn | PxDpn^Bm | 56 | 52 | 72 | XP_011568811.1 | Dpn-like | E |
| | E(spl)md | PxE(spl)md^Bm | 82 | 54 | 93 | XP_011568729.1 | Hypothetical protein | E |
| COE | kn | PxKn | 92 | 99 | 99 | XP_011563271.1 | COE | F |

Each PxbHLH gene is named according to its ortholog of fruit fly (D. melanogaster) or other insects as indicated with superscript letters. Bootstrap values were from in-group phylogenetic analyses. For group B candidates, OsRa (the Oryza sativa bHLH motif sequence of R family) was used as outgroup. For group A and C–F candidates, DmMnt (a D. melanogaster bHLH motif sequence of B group) was used as outgroup. Superscript letters Bm, Dc, and Nv indicate gene orthology assignment using Bombyx mori (Bm), Diaphorina citri (Dc), and Nasonia vitripennis (Nv) bHLH motifs. In the last column, bold letters indicate consistent GenBank annotations with our classifications. Bold-italic letters indicate that GenBank annotations are based on family names which do not contain any information about its orthology with known insect bHLH gene. Italic letters indicate different GenBank annotations with our classification. Normal letters indicate hypothetical protein.

GenBank protein accession numbers and annotations for all 52 PxbHLH family members are listed in Table 1. A comparison between the GenBank annotations and our classification shows that not all PxbHLH proteins have been annotated in agreement with our classification. Firstly, our classification of 19 PxbHLH proteins is consistent with the GenBank annotations. For example, both the GenBank annotation and our classification of protein XP_011562492.1 are da (daughterless). Secondly, the GenBank annotations of 14 PxbHLH proteins are based on bHLH family names, which carry no information about orthology with a known insect bHLH gene. For example, the GenBank annotation of protein XP_011568337.1 is ASC (achaete-scute complex) protein T8, which is based on the family name ASC. Our classification of this protein is Ase (asense), which is a specific gene name in the ASC family. Thus, our classification provides useful information for improving the annotations of these 14 PxbHLH proteins. Thirdly, the GenBank annotations of 13 PxbHLH proteins differ from our classification. For example, the GenBank annotation of protein XP_011566759.1 is Ngn2 (neurogenin 2), whereas it is Dimm (dimmed) in our classification. Our classification of each PxbHLH protein is based on in-group phylogenetic analysis supported by bootstrap values higher than 50, while GenBank annotation is mainly based on sequence identity with known proteins. We therefore consider our classification to be more accurate than the GenBank annotation. Finally, six PxbHLH proteins are annotated as hypothetical proteins in GenBank. We have classified them as specific bHLH genes. Thus, six new bHLH proteins are found in P. xylostella protein databases.
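
The four annotation categories above partition the 52 PxbHLH proteins, which can be checked with simple arithmetic. The dictionary keys below are shorthand labels chosen for this sketch, not terms from GenBank.

```python
# Counts taken from the paragraph above.
annotation_counts = {
    "consistent with our classification": 19,
    "family name only": 14,
    "different from our classification": 13,
    "hypothetical protein": 6,
}

total = sum(annotation_counts.values())
needing_update = total - annotation_counts["consistent with our classification"]

print(total)           # all PxbHLH proteins
print(needing_update)  # proteins whose annotation could be refined
```

The second figure corresponds to the "remaining 33 PxbHLH genes" mentioned in the Abstract.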

Structural Domains in PxbHLH Protein Sequences

Previous studies revealed that bHLH proteins of groups C, E, and F usually possess typical conserved structural domains (Jones 2004). To further validate the reliability of our classification, we constructed an ML phylogenetic tree with the 52 PxbHLH motif sequences (Fig. 2, left panel) and predicted the structural domains of PxbHLH proteins using the SMART program (Fig. 2, right panel). Eleven PxbHLH proteins of group C have two PAS (Per-Arnt-Sim) domains and nine of them have a PAC (C-terminal to PAS motif) domain, while the five members of group E have an ORANGE domain, and the PxKn protein of group F has three additional domains, viz. COE1 (collier/olfactory-1/early B-cell factor), IPT (immunoglobulin plexin transcription factor), and MSF1 (major facilitator superfamily 1) (Fig. 2). In summary, the typical structural domains are present in PxbHLH proteins of groups C, E, and F. Therefore, our classification of PxbHLH proteins of these groups is supported not only by in-group phylogenetic analysis with bootstrap values higher than 50 but also by the presence of specific structural domains in these proteins.
Fig. 2.

Architecture of P. xylostella bHLH protein conserved domains. The left panel is a ML tree constructed using 52 PxbHLH motif amino acids with OsRa (the Oryza sativa bHLH motif sequence of R family) as outgroup. PxbHLH names of groups A to F are displayed in blue, red, green, purple, magenta, and aqua, respectively. The right panel is a schematic diagram showing HLH and other protein domains detected by SMART program online. Seven different protein domains, namely HLH, PAS, PAC, ORANGE, COE1, IPT, and MSF1, are found in P. xylostella bHLH proteins.

Among all PxbHLH proteins, PxHand has dual HLH motifs. Previously, four bHLH proteins, viz. Clk (clock), Sima (similar), Cyc (cycle), and Cwo (clockwork orange), were found to have dual HLH motifs in Asian citrus psyllid (Peng et al. 2017). No bHLH proteins have dual HLH motifs in jewel wasp, human body louse, and brown planthopper (Wang et al. 2014, Liu et al. 2015, Wan et al. 2016). To see whether other insect bHLH proteins have dual HLH motifs, full-length bHLH protein sequences of ten insect species, viz. fruit fly (D. melanogaster), yellow fever mosquito (Aedes aegypti (L.) (Diptera: Culicidae)), African malaria mosquito (Anopheles gambiae (Giles) (Diptera: Culicidae)), southern house mosquito (Culex quinquefasciatus (Jupp) (Diptera: Culicidae)), honey bee (Apis mellifera (L.) (Hymenoptera: Apidae)), Jerdon’s jumping ant (Harpegnathos saltator (Jerdon) (Hymenoptera: Formicidae)), domestic silkworm (B. mori), monarch butterfly (Danaus plexippus (L.) (Lepidoptera: Nymphalidae)), red flour beetle (Tribolium castaneum (Herbst) (Coleoptera: Tenebrionidae)), and pea aphid (Acyrthosiphon pisum (Harris) (Hemiptera: Aphididae)), were retrieved and analyzed using the SMART program online. As a result, the Hand protein of D. plexippus and the Fer2 (forty-eight related 2) and Gce (germ cell-expressed) proteins of C. quinquefasciatus were found to have dual HLH motifs.

Therefore, dual HLH motifs have been found in four insect species, among which P. xylostella and D. plexippus belong to Lepidoptera, C. quinquefasciatus belongs to Diptera, and D. citri belongs to Hemiptera. In summary, dual HLH motifs exist in the Hand protein of two Lepidopteran insects, P. xylostella and D. plexippus, but not in B. mori. Dual motifs were also found in two bHLH proteins of one Dipteran insect, C. quinquefasciatus, but not in other Dipteran insects such as D. melanogaster, A. aegypti, and A. gambiae. Dual HLH motifs were also identified in four bHLH proteins of one Hemipteran insect, D. citri, but not in two other Hemipteran insects, A. pisum and Nilaparvata lugens (Stål) (Hemiptera: Delphacidae). Because only a few dual HLH motifs are found among all bHLH proteins of the fifteen insect species, and the only dual HLH motifs shared between species are those in the Hand protein of two Lepidopteran species, we consider that these dual HLH motifs were not inherited from the common ancestor of insects. Instead, they resulted from independent duplication of the HLH-coding DNA segment in individual species or in specific insect lineages.

Genomic Coding Regions of PxbHLH Motifs

The coding information of the 52 PxbHLH motifs is listed in Table 2. Five PxbHLH genes were found to have multiple copies in the P. xylostella genome: PxSage and PxSCL2 have three copies each, while PxMyoR, PxMitf, and PxSima have two copies each. Thirty-two PxbHLH motifs have coding regions interrupted by introns. Among them, the coding regions of the PxMad, PxBmx, PxH, and PxDpn motifs are each interrupted by two introns, and each of the remaining 28 motifs is interrupted by one intron. A comparison with other insect species (Table 3) reveals that P. xylostella has the highest number of bHLH motifs with introns and the highest total number of introns (the latter shared with Acyrthosiphon pisum). In addition, by the values in Table 3, it ranks first, third, and fourth in the length of the shortest intron, the length of the longest intron, and the average intron length, respectively. These data indicate that the coding regions of PxbHLH motifs are interrupted by more and longer introns than those of most other insects, which could have important implications for future studies of intron gain or loss events during bHLH gene evolution.
Table 2.

Coding regions, intron location and length of 52 P. xylostella bHLH motifs

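| Family | PxbHLH name | Contig No. | Frame | Coding region(s) | Intron location and length | Group |
|---|---|---|---|---|---|---|
| ASCa | PxAse | NW_011952028.1 | +2 | 991757–991951 | | A |
| | PxAsh2a | NW_011952110.1 | +2 | 351530–351721 | | A |
| | PxAsh2b | NW_011952028.1 | −2 | 848610–848419 | | A |
| | PxAsh3 | NW_011952110.1 | −2 | 372924–372733 | | A |
| E12/E47 | PxDa | NW_011952010.1 | −3 | 998879–998851 | Basic: 3026 bp | A |
| | | | −2 | 995824–995692 | | |
| MyoD | PxNau | NW_011952428.1 | −1 | 92621–92577 | Helix 1: 20855 bp | A |
| | | | −3 | 71721–71614 | | |
| Ngn | PxTap | NW_011952067.1 | −1 | 273995–273837 | | A |
| Mist | PxDimm | NW_011952025.1 | −1 | 397556–397494 | Helix 1: 308 bp | A |
| | | | −3 | 397185–397087 | | |
| Beta3 | PxOli | NW_011952096.1 | −1 | 760649–760602 | Helix 1: 400 bp | A |
| | | | −2 | 760201–760085 | | |
| Atonal | PxAto | NW_011952451.1 | +2 | 100991–101149 | | A |
| Net | PxNet | NW_011952061.1 | −1 | 1139215–1139057 | | A |
| MyoRa | PxMyoR^a | NW_011952029.1 | −2 | 1464417–1464353 | Helix 1: 3828 bp | A |
| | | | −2 | 1460524–1460431 | | |
| | | NW_011952149.1 | +3 | 81435–81499 | Helix 1: 222 bp | A |
| | | | +3 | 81722–81815 | | |
| Delilah | PxTx | NW_011952025.1 | −2 | 1659310–1659134 | | A |
| Mesp | PxSage^a | NW_011952173.1 | +3 | 428397–428527 | Helix 2: 662 bp | A |
| | | | +2 | 429190–429220 | | |
| | | NW_011952455.1 | +1 | 19441–19571 | Helix 2: 2834 bp | A |
| | | | +3 | 22406–22436 | | |
| | | NW_011953665.1 | −2 | 1286–1156 | Helix 2: 708 bp | A |
| | | | −2 | 447–417 | | |
| Paraxis | PxPxs | NW_011952256.1 | +3 | 333675–333718 | Helix 1: 387 bp | A |
| | | | +3 | 334106–334220 | | |
| Twist | PxTwi | NW_011952038.1 | +3 | 1035549–1035704 | | A |
| PTFa | PxFer1 | NW_011952151.1 | −2 | 98544–98386 | | A |
| PTFb | PxFer2 | NW_011952355.1 | +3 | 145638–145681 | Helix 1: 1858 bp | A |
| | | | +1 | 147540–147654 | | |
| | PxFer3 | NW_011952010.1 | +1 | 371209–371286 | Helix 1: 645 bp | A |
| | | | +1 | 371932–372012 | | |
| Hand | PxHand | NW_011952203.1 | +3 | 535854–535967 | Helix 2: 1283 bp | A |
| | | | +2 | 537251–537295 | | |
| SCL | PxSCL1 | NW_011952051.1 | −2 | 276186–276047 | Helix 2: 868 bp | A |
| | | | −2 | 275178–275160 | | |
| | PxSCL2^a | NW_011952031.1 | +2 | 921275–921414 | Helix 2: 958 bp | A |
| | | | +3 | 922373–922391 | | |
| | | NW_011952031.1 | +1 | 894211–894354 | Helix 2: 6842 bp | A |
| | | | +3 | 901197–901211 | | |
| | | NW_011952051.1 | −2 | 260131–259992 | Helix 2: 634 bp | A |
| | | | −3 | 259357–259339 | | |
| | PxSCL3 | NW_011952080.1 | +2 | 19034–19173 | Helix 2: 1431 bp | A |
| | | | +2 | 20605–20623 | | |
| | PxSCL4 | NW_011952031.1 | +2 | 932273–932412 | Helix 2: 11849 bp | A |
| | | | +1 | 944262–944280 | | |
| NSCL | PxNSCL | NW_011952205.1 | −3 | 245480–245322 | | A |
| Mnt | PxMnt | NW_011952044.1 | −3 | 109144–108995 | Helix 2: 2566 bp | B |
| | | | −3 | 106428–106420 | | |
| Max | PxMax | NW_011952071.1 | −3 | 651821–651717 | Loop: 761 bp | B |
| | | | −2 | 650955–650902 | | |
| Mad | PxMad | NW_011952092.1 | −1 | 980031–979997 | Basic: 155422 bp | B |
| | | | −2 | 824574–824460 | Helix 2: 12791 bp | |
| | | | −1 | 811676–811668 | | |
| Myc | PxDm | NW_011952144.1 | −2 | 424292–424134 | | B |
| USF | PxUsf | NW_011952056.1 | +2 | 192728–192886 | | B |
| MITF | PxMitf^a | NW_011952746.1 | −3 | 10517–10496 | Basic: 1104 bp | B |
| | | | −3 | 9391–9234 | | |
| | | NW_011952227.1 | −3 | 360653–360632 | Basic: 340 bp | B |
| | | | −1 | 360291–360134 | | |
| TF4 | PxBmx | NW_011952010.1 | +1 | 2035402–2035419 | Basic: 427 bp | B |
| | | | +2 | 2035847–2035960 | Helix 2: 579 bp | |
| | | | +2 | 2036540–2036578 | | |
| MLX | PxMio | NW_011952044.1 | −1 | 1378695–1378585 | Loop: 1966 bp | B |
| | | | −2 | 1376618–1376565 | | |
| SRC | PxTai | NW_011952147.1 | +2 | 498170–498177 | Basic: 419 bp | B |
| | | | +1 | 498597–498750 | | |
| Clock | PxClk | NW_011952113.1 | −1 | 496558–496554 | Basic: 457 bp | C |
| | | | −3 | 496096–495949 | | |
| | PxMet | NW_011952215.1 | −3 | 426431–426270 | | C |
| | PxJHR | NW_011952043.1 | +3 | 1153950–1154111 | | C |
| AHR | PxDys1 | NW_011952261.1 | −1 | 285508–285347 | | C |
| | PxDys2 | NW_011952189.1 | +3 | 95025–95186 | | C |
| | PxSs | NW_011952011.1 | −1 | 2908495–2908334 | | C |
| Sim | PxSim | NW_011952370.1 | −3 | 65993–65832 | | C |
| Trh | PxTrh | NW_011952193.1 | +1 | 289018–289179 | | C |
| HIF | PxSima^a | NW_011952120.1 | +1 | 159094–159255 | | C |
| | | NW_011952120.1 | −2 | 2937–2776 | | C |
| ARNT | PxTgo | NW_011952257.1 | +1 | 516712–516873 | | C |
| BMAL | PxCyc | NW_011952214.1 | +2 | 288044–288205 | | C |
| Emc | PxEm | NW_011952029.1 | +1 | 1547437–1547535 | | D |
| Hey | PxHey | NW_011952303.1 | +2 | 370700–370867 | | E |
| | PxCwo | NW_011952021.1 | +3 | 1662000–1662167 | | E |
| H/E(spl) | PxH | NW_011952447.1 | +3 | 138696–138701 | Basic: 12055 bp | E |
| | | | +1 | 150757–150852 | Loop: 274 bp | |
| | | | +2 | 151127–151198 | | |
| | PxDpn | NW_011952033.1 | +3 | 189775–189780 | Basic: 437 bp | E |
| | | | +3 | 190218–190313 | Loop: 372 bp | |
| | | | +3 | 190686–190757 | | |
| | PxE(spl)md | NW_011952032.1 | +1 | 187693–187788 | Loop: 561 bp | E |
| | | | +1 | 188350–188427 | | |
| COE | PxKn | NW_011952450.1 | +3 | 187645–187733 | Helix 2: 566 bp | F |
| | | | +2 | 188300–188344 | | |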

Superscript a: multiple copies of this bHLH gene are present in the P. xylostella genome.
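
The intron lengths in Table 2 are consistent with the gap between adjacent coding regions on the contig. The sketch below makes that relationship explicit under the assumption that coordinates are inclusive, so the intron is the number of bases strictly between two coding regions; the function name `intron_length` is illustrative. It works for both strands because minus-strand coordinates in Table 2 simply run in descending order.

```python
def intron_length(upstream_region, downstream_region):
    """Number of bases between two adjacent coding regions.

    Each region is a (start, end) pair as printed in Table 2. Coordinates
    are assumed to be inclusive, so the intron spans the bases strictly
    between the end of the first region and the start of the second.
    """
    _, first_end = upstream_region
    second_start, _ = downstream_region
    return abs(second_start - first_end) - 1


if __name__ == "__main__":
    # PxDa, minus strand: Table 2 lists an intron of 3026 bp.
    print(intron_length((998879, 998851), (995824, 995692)))
    # PxTai, plus strand: Table 2 lists an intron of 419 bp.
    print(intron_length((498170, 498177), (498597, 498750)))
```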

Table 3.

Intron number and length in coding regions of insect bHLH motifs

| Clade | Order | Insect species | No. of bHLH motifs having introns | Total no. of introns | Length of the shortest intron (bp) | Length of the longest intron (bp) | Average length of introns (bp) |
|---|---|---|---|---|---|---|---|
| Holometabola | Diptera | Aedes aegypti (Aa) | 25 | 28 | 36 | 315 344 | 16 707 |
| | | Anopheles gambiae (Ag) | 21 | 23 | 57 | 37 485 | 2 279 |
| | | Culex quinquefasciatus (Cq) | 22 | 24 | 56 | 14 434 | 2 464 |
| | | Drosophila melanogaster (Dm) | 18 | 20 | 57 | 11 845 | 1 027 |
| | Hymenoptera | Apis mellifera (Am) | 24 | 29 | 72 | 129 558 | 11 020 |
| | | Nasonia vitripennis (Nv) | 22 | 27 | 77 | 174 325 | 11 715 |
| | | Harpegnathos saltator (Hs) | 23 | 27 | 82 | 127 364 | 6 326 |
| | Lepidoptera | Bombyx mori (Bm) | 24 | 28 | 78 | 11 651 | 1 749 |
| | | Danaus plexippus (Dp) | 25 | 30 | 74 | 4 539 | 607 |
| | | Plutella xylostella (Px) | 32 | 36 | 222 | 155 422 | 6 963 |
| | Coleoptera | Tribolium castaneum (Tc) | 24 | 29 | 44 | 100 326 | 4 841 |
| Paraneoptera | Hemiptera | Diaphorina citri (Dc) | 23 | 28 | 82 | 68 654 | 6 759 |
| | | Acyrthosiphon pisum (Ap) | 28 | 36 | 62 | 30 718 | 4 003 |
| | | Nilaparvata lugens (Nl) | 23 | 29 | 58 | 14 128 | 2 736 |
| | Phthiraptera | Pediculus humanus corporis (Phc) | 22 | 27 | 66 | 6 723 | 695 |
| Average | | | 24 | 28 | 75 | 80 168 | 5 326 |

Insect species have been organized into two groups (i.e., Holometabola and Paraneoptera) under infraclass Neoptera of class Insecta. Data of P. xylostella are from this study. Data of Danaus plexippus are from our unpublished work. Data of Apis mellifera, Pediculus humanus corporis (Light) (Phthiraptera: Pediculidae), Diaphorina citri, Acyrthosiphon pisum, Harpegnathos saltator, Bombyx mori, Aedes aegypti, Anopheles gambiae, Nasonia vitripennis and Culex quinquefasciatus are from previous reports (Wang et al. 2007, Wang et al. 2008, Dang et al. 2011, Liu et al. 2012, Zhang et al. 2013, Wang et al. 2014, Liu et al. 2015, Peng et al. 2017). Data of Drosophila melanogaster, Nilaparvata lugens, and Tribolium castaneum are from our own survey based on reports of Simionato et al. (2007), Bitra et al (2009), and Wan et al (2016). The same sources of data are used in Table 4.
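
The "Average" row of Table 3 and the position of P. xylostella in each column can be recomputed from the per-species values. The sketch below uses the Table 3 figures as transcribed here, keyed by the species abbreviations; the names `TABLE3`, `column_mean`, and `rank` are illustrative, and ties share the better rank.

```python
# (motifs with introns, total introns, shortest, longest, average) per species.
COLUMNS = ("motifs_with_introns", "total_introns", "shortest", "longest", "average")
TABLE3 = {
    "Aa": (25, 28, 36, 315344, 16707),
    "Ag": (21, 23, 57, 37485, 2279),
    "Cq": (22, 24, 56, 14434, 2464),
    "Dm": (18, 20, 57, 11845, 1027),
    "Am": (24, 29, 72, 129558, 11020),
    "Nv": (22, 27, 77, 174325, 11715),
    "Hs": (23, 27, 82, 127364, 6326),
    "Bm": (24, 28, 78, 11651, 1749),
    "Dp": (25, 30, 74, 4539, 607),
    "Px": (32, 36, 222, 155422, 6963),
    "Tc": (24, 29, 44, 100326, 4841),
    "Dc": (23, 28, 82, 68654, 6759),
    "Ap": (28, 36, 62, 30718, 4003),
    "Nl": (23, 29, 58, 14128, 2736),
    "Phc": (22, 27, 66, 6723, 695),
}


def column(name):
    """Return {species: value} for one Table 3 column."""
    index = COLUMNS.index(name)
    return {species: row[index] for species, row in TABLE3.items()}


def column_mean(name):
    """Mean of a column over all species, rounded to the nearest integer."""
    values = column(name).values()
    return round(sum(values) / len(values))


def rank(species, name):
    """1-based rank of a species in a column, largest value first."""
    values = column(name)
    return 1 + sum(value > values[species] for value in values.values())


if __name__ == "__main__":
    for name in COLUMNS:
        print(name, "mean:", column_mean(name), "Px rank:", rank("Px", name))
```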

Table 4.

bHLH family members in 15 insect species

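Species columns by order: Diptera (Aa, Ag, Cq, Dm), Hymenoptera (Am, Nv, Hs), Lepidoptera (Bm, Dp, Px), Col (Tc), Hemiptera (Dc, Ap, Nl), Pht (Phc). Columns Aa to Tc are Holometabola; columns Dc to Phc are Paraneoptera.

| Group | bHLH family | Aa | Ag | Cq | Dm | Am | Nv | Hs | Bm | Dp | Px | Tc | Dc | Ap | Nl | Phc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | ASCa | 3 | 2 | 3 | 4 | 2 | 2 | 2 | 4 | 4 | 4 | 2 | 1 | 0 | 2 | 2 |
| | ASCb | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 1 | 0 | 1 |
| | E12/E47 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | MyoD | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 |
| | Ngn | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 2 | 1 |
| | NeuroD | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| | Mist | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 |
| | Beta3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Atonal | 5 | 4 | 5 | 3 | 3 | 3 | 3 | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 3 |
| | Olig | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | Net | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 |
| | MyoRa | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | MyoRb | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | Delilah | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 2 | 1 | 1 | 3 | 1 |
| | Mesp | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 |
| | Paraxis | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 |
| | Twist | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 |
| | PTFa | 1 | 2 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | PTFb | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 2 |
| | Hand | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 1 | 1 | 1 |
| | SCL | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4 | 1 | 1 | 1 | 1 | 1 |
| | NSCL | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| B | Mnt | 1 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Mad | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 |
| | Max | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 3 | 1 | 1 |
| | Myc | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | USF | 1 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | MITF | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 2 |
| | AP4 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 1 | 1 | 0 | 1 | 1 | 1 | 2 | 1 |
| | TF4 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 |
| | MLX | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
| | SREBP | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 3 | 1 |
| | Figa | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | SRC | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 |
| C | Clock | 2 | 2 | 2 | 3 | 2 | 2 | 2 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 |
| | AHR | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 |
| | Sim | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 |
| | Trh | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | HIF | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | ARNT | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | BMAL | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| D | Emc | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| E | Hey | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 2 | 2 |
| | H/E(spl) | 4 | 4 | 4 | 11 | 6 | 4 | 6 | 5 | 5 | 3 | 6 | 6 | 7 | 6 | 8 |
| F | COE | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 |
| Total | | 55 | 55 | 57 | 59 | 55 | 48 | 56 | 52 | 53 | 52 | 54 | 52 | 55 | 60 | 55 |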

Uncertainty of classification that previously existed in families ASCb, Hey, and H/E(spl) has been eliminated through our in-depth phylogenetic analysis. Please refer to Table 3 for full names of individual insect species.

Col: Coleoptera.

Pht: Phthiraptera.
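The comparison of family counts across species in Table 4 can also be done programmatically. The sketch below is illustrative only: it includes just five rows copied from Table 4, and the function name `distinctive_families` is hypothetical. It reports families in which the count for one species differs from the count in every other species.

```python
# Species order follows the columns of Table 4.
SPECIES = ["Aa", "Ag", "Cq", "Dm", "Am", "Nv", "Hs", "Bm",
           "Dp", "Px", "Tc", "Dc", "Ap", "Nl", "Phc"]

# A subset of rows copied from Table 4 (gene counts per species).
COUNTS = {
    "E12/E47": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Twist":   [1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1],
    "SCL":     [1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1, 1, 1],
    "AP4":     [1, 1, 1, 1, 2, 2, 2, 1, 1, 0, 1, 1, 1, 2, 1],
    "SREBP":   [1, 1, 2, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 3, 1],
}


def distinctive_families(counts, species, target):
    """Return {family: (target count, min of others, max of others)} for
    families where the target's count matches no other species."""
    idx = species.index(target)
    result = {}
    for family, row in counts.items():
        others = row[:idx] + row[idx + 1:]
        if row[idx] not in others:
            result[family] = (row[idx], min(others), max(others))
    return result


print(distinctive_families(COUNTS, SPECIES, "Px"))
```

For these five rows, the Px column stands out in SCL (four genes versus one elsewhere) and in AP4 and SREBP (no genes versus one to three elsewhere).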

Special bHLH Genes in P. xylostella

To date, bHLH repertoires have been established for 15 insect species. The number of genes in each bHLH family is listed in Table 4. A comparison with other insects reveals special bHLH genes in P. xylostella.

Firstly, P. xylostella has four stem cell leukemia (SCL) genes, among which PxSCL2 has three copies. Sequences were treated as separate genes when their amino acid sequences differ, and as copies of one gene when their amino acid sequences are identical. All other insects whose bHLH repertoires have been established have only one SCL gene. A phylogenetic tree constructed using the SCL bHLH motif amino acids of the 15 insect species shows that the four PxSCL genes cluster in a separate clade, indicating that they originated from species-specific gene duplication in P. xylostella (Fig. 3a). The SCL gene was first discovered in a human leukemic stem-cell line (Begley et al. 1989). It is expressed in a number of cell types, including haematopoietic stem cells, megakaryocytic cells, progenitor cells, and committed erythroid cells. It plays a significant role in regulating the proliferation and differentiation of various hematopoietic cells (Begley et al. 1991, Green and Begley 1992, Curtis et al. 2012, Real et al. 2012). In D. melanogaster, restricted expression of SCL was observed in a subset of cells in the developing central nervous system (Varterasian et al. 1993). It would be interesting to study where and when the four PxSCL genes are expressed and what mechanisms PxSCL proteins employ to regulate growth and development in P. xylostella.
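The distinction drawn above between separate genes (different amino acid sequences) and copies of one gene (identical amino acid sequences) can be sketched as a grouping step. This is a minimal illustration, not the procedure used in the study: the locus names and the short sequences are hypothetical placeholders, not real PxSCL data, and the function name `group_copies` is an assumption.

```python
from collections import defaultdict


def group_copies(motifs):
    """Group locus names by identical amino acid sequence.

    Each group is treated as one gene; a group with more than one
    member represents multiple copies of that gene.
    """
    groups = defaultdict(list)
    for name, sequence in motifs.items():
        groups[sequence].append(name)
    return list(groups.values())


# Hypothetical placeholder sequences (not real data): six loci, where
# locus_B, locus_C, and locus_D share an identical sequence.
motifs = {
    "locus_A": "RRVFTNSRERWRQ",
    "locus_B": "RKVFTNSRERWRQ",
    "locus_C": "RKVFTNSRERWRQ",
    "locus_D": "RKVFTNSRERWRQ",
    "locus_E": "RRIFTNSRERWRQ",
    "locus_F": "RRVFTNTRERWRQ",
}

genes = group_copies(motifs)
print(len(genes), "genes;", "largest copy number:", max(len(g) for g in genes))
```

With these placeholder inputs the six loci collapse into four genes, one of which has three copies, mirroring the pattern described for the SCL family.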
Fig. 3.

Evolutionary relationship among insect SCL and Dys genes. (a) A maximum-likelihood phylogenetic tree based on bHLH motif amino acids encoded by SCL genes of 15 insect species. Phylogenetic clades shown in thick lines indicate species-specific gene duplication in P. xylostella. (b) A maximum-likelihood phylogenetic tree based on bHLH motif amino acids encoded by Dys genes of 15 insect species. Phylogenetic clades shown in thick lines indicate lineage-specific gene expansion in Lepidoptera. Both trees have been rooted using the DmMnt (D. melanogaster Mnt) motif amino acids. Sequence names are indicated using a two-letter abbreviation of species name plus gene name. Please refer to Table 3 for full names of individual insect species.

Secondly, P. xylostella has two Dys genes. Among the 15 insect species, all three lepidopteran species (i.e., B. mori, Danaus plexippus, and P. xylostella) have two Dys genes, while the other insects have only one. A phylogenetic tree constructed using the Dys bHLH motif amino acids of the 15 insect species (Fig. 3b) shows that the Dys1 and Dys2 genes of B. mori, D. plexippus, and P. xylostella fall into two separate clades. This phylogenetic pattern demonstrates that the two Dys genes originated from lineage-specific gene expansion in Lepidoptera. In fruit fly, Dys (dysfusion) is responsible for regulating gene expression in tracheal fusion (Jiang and Crews 2007). It is also involved in the regulation of pro-apoptosis and head involution defective in tarsal joints (Iordanou et al. 2011). Because the basic regions of PxDys1 and PxDys2 differ at three amino acid positions (Fig. 1), it is possible that the PxDys1 and PxDys2 proteins recognize different target DNA elements and play different regulatory roles in trachea development of P. xylostella.

Thirdly, P. xylostella has two copies each of the MyoR, Mitf, and Sima genes and three copies of the Sage gene. Among them, the coding regions of the PxMyoR, PxSage, and PxMitf bHLH motifs are each interrupted by one intron, and the intron length differs between copies, suggesting that each gene copy has diverged slightly since duplication. The MyoR (myogenic repressor) gene is expressed in undifferentiated myoblasts and down-regulated during myoblast differentiation (Lu et al. 1999). The Sage (salivary gland-expressed bHLH) protein can form a dimer with the Da (daughterless) protein, which is necessary to maintain expression of the sens gene in the embryonic salivary gland. Expression of sens can prevent apoptosis of salivary gland cells in embryos (Chandrasekaran and Beckendorf 2003). The Mitf (microphthalmia transcription factor) gene is expressed during Drosophila embryonic development and in Drosophila eye-buds/antennae-buds (Hallsson et al. 2004). It is also involved in regulating lysosomal biogenesis and expression of multiple V-ATPase in D. melanogaster (Tognon et al. 2016). Sima (similar) and Tgo (tango) form a complex that activates the corresponding gene expression under hypoxic conditions (Lavista-Llanos et al. 2002). Under hypoxic conditions, Sima protein accumulates in Drosophila SL2 cells (Bacon et al. 1998). Taken together, the MyoR, Sage, Mitf, and Sima genes are mainly involved in the regulation of myoblast differentiation, sens gene expression, eye development, and gene expression under hypoxic conditions, respectively. Further studies may be conducted to understand the functions of these multiple-copy genes in regulating growth and development of specific cells and tissues, such as myoblasts and eye-buds, in P. xylostella.

Finally, we did not find AP4 (activating element-binding protein 4) or SREBP (sterol regulatory element-binding protein) genes in P. xylostella, whereas each of the other 14 insects has one to three such genes. AP4 is a protein that binds to viral SV40 enhancer elements and activates viral late transcription (Mermod et al. 1988). In addition, AP4 can form a complex with geminin and negatively regulate its target gene in non-neuronal cells (Kim et al. 2006). SREBP is crucial to the survival of Drosophila larvae. When this gene was deleted, larval growth was severely blocked and larvae died before the third-instar molt (Kunte et al. 2006). In view of the importance of these two genes in regulating animal growth and development, P. xylostella seems unlikely to lack them. Their apparent absence is probably due to incompleteness of the P. xylostella genome database. Therefore, when the genome sequence of the diamondback moth is further refined, we will re-examine these data.
  50 in total

1.  Genome-wide analysis of basic/helix-loop-helix transcription factor family in rice and Arabidopsis.

Authors:  Xiaoxing Li; Xuepeng Duan; Haixiong Jiang; Yujin Sun; Yuanping Tang; Zheng Yuan; Jingkang Guo; Wanqi Liang; Liang Chen; Jingyuan Yin; Hong Ma; Jian Wang; Dabing Zhang
Journal:  Plant Physiol       Date:  2006-08       Impact factor: 8.340

2.  Recognition by Max of its cognate DNA through a dimeric b/HLH/Z domain.

Authors:  A R Ferré-D'Amaré; G C Prendergast; E B Ziff; S K Burley
Journal:  Nature       Date:  1993-05-06       Impact factor: 49.962

3.  Cross-resistance patterns and fitness in fufenozide-resistant diamondback moth, Plutella xylostella (Lepidoptera: Plutellidae).

Authors:  Jingyan Sun; Pei Liang; Xiwu Gao
Journal:  Pest Manag Sci       Date:  2011-08-19       Impact factor: 4.845

4.  Phylogenetic analyses of vector mosquito basic helix-loop-helix transcription factors.

Authors:  D B Zhang; Y Wang; A K Liu; X H Wang; C W Dang; Q Yao; K P Chen
Journal:  Insect Mol Biol       Date:  2013-08-01       Impact factor: 3.585

5.  SCL/TAL1 regulates hematopoietic specification from human embryonic stem cells.

Authors:  Pedro J Real; Gertrudis Ligero; Veronica Ayllon; Veronica Ramos-Mejia; Clara Bueno; Ivan Gutierrez-Aranda; Oscar Navarro-Montero; Majlinda Lako; Pablo Menendez
Journal:  Mol Ther       Date:  2012-04-10       Impact factor: 11.454

6.  Two new Drosophila genes related to human hematopoietic and neurogenic transcription factors.

Authors:  M Varterasian; S Lipkowitz; I Karsch-Mizrachi; B Paterson; I Kirsch
Journal:  Cell Growth Differ       Date:  1993-11

7.  Functional characterization of PAS and HES family bHLH transcription factors during the metamorphosis of the red flour beetle, Tribolium castaneum.

Authors:  Kavita Bitra; Anjiang Tan; Ashley Dowling; Subba R Palli
Journal:  Gene       Date:  2009-08-13       Impact factor: 3.688

8.  The basic helix-loop-helix leucine zipper transcription factor Mitf is conserved in Drosophila and functions in eye development.

Authors:  Jón H Hallsson; Benedikta S Haflidadóttir; Chad Stivers; Ward Odenwald; Heinz Arnheiter; Francesca Pignoni; Eiríkur Steingrímsson
Journal:  Genetics       Date:  2004-05       Impact factor: 4.562

9.  Origin and diversification of the basic helix-loop-helix gene family in metazoans: insights from comparative genomics.

Authors:  Elena Simionato; Valérie Ledent; Gemma Richards; Morgane Thomas-Chollier; Pierre Kerner; David Coornaert; Bernard M Degnan; Michel Vervoort
Journal:  BMC Evol Biol       Date:  2007-03-02       Impact factor: 3.260

10.  Border-cell migration requires integration of spatial and temporal signals by the BTB protein Abrupt.

Authors:  Anna C-C Jang; Yu-Chiuan Chang; Jianwu Bai; Denise Montell
Journal:  Nat Cell Biol       Date:  2009-04-06       Impact factor: 28.824
