
Characterization of the complete mitochondrial genomes of Cnaphalocrocis medinalis and Chilo suppressalis (Lepidoptera: Pyralidae).

Huan-Na Chai, Yu-Zhou Du, Bao-Ping Zhai.

Abstract

The complete mitochondrial genomes (mitogenomes) of Cnaphalocrocis medinalis and Chilo suppressalis (Lepidoptera: Pyralidae) were determined and analyzed. The circular genomes were 15,388 bp long for C. medinalis and 15,395 bp long for C. suppressalis. Both mitogenomes contained 37 genes, with gene order similar to that of other lepidopterans. Notably, 12 protein-coding genes (PCGs) utilized the standard ATN start codons, but the cox1 gene used CGA as the initiation codon; the cox1, cox2, and nad4 genes in the two mitogenomes had the truncated termination codons T, T, and TA, respectively, but the nad5 gene was found to use T as the termination codon only in the C. medinalis mitogenome. Additionally, the codon distribution and Relative Synonymous Codon Usage of the 13 PCGs in the C. medinalis mitogenome were very different from those in other pyralid moth mitogenomes. Most of the tRNA genes had typical cloverleaf secondary structures. However, the dihydrouridine (DHU) arm of the trnS1(AGN) gene did not form a stable stem-loop structure. Forty-nine helices in six domains, and 33 helices in three domains were present in the secondary structures of the rrnL and rrnS genes of the two mitogenomes, respectively. Apart from the A+T-rich region, there were four major intergenic spacers, each spanning at least 12 bp, in the two mitogenomes. The A+T-rich region contained an 'ATAGT(A)'-like motif followed by a poly-T stretch in the two mitogenomes. In addition, a potential stem-loop structure, a duplicated 25-bp repeat element, and a microsatellite '(TA)13' were observed in the A+T-rich region of the C. medinalis mitogenome. A poly-T motif, a duplicated 31-bp repeat element, and a 19-bp triplication were found in the C. suppressalis mitogenome. However, there are many differences in the A+T-rich regions between the C. suppressalis mitogenome sequence in the present study and previous reports. 
Finally, the phylogenetic relationships of these insects were reconstructed based on amino acid sequences of the 13 mitochondrial PCGs using Bayesian inference and maximum likelihood methods. These molecular-based phylogenies support the traditional morphologically based view of relationships within the Pyralidae.

Keywords:  Chilo suppressalis; Cnaphalocrocis medinalis; Lepidoptera; Mitochondrial genome; Pyralidae; phylogenetic relationship.

Year:  2012        PMID: 22532789      PMCID: PMC3334671          DOI: 10.7150/ijbs.3540

Source DB:  PubMed          Journal:  Int J Biol Sci        ISSN: 1449-2288            Impact factor:   6.580


Introduction

The mitochondrial genome (mitogenome), which is responsible for the oxidative reactions of the tricarboxylic acid cycle, as well as electron transfer and energy metabolism in cells, forms a unit of genetic information, evolving independently from the nuclear genome 1. The mitogenome is characterized by its small size, maternal inheritance, stable and relatively short circular structure, lack of recombination, and frequent polymorphisms in most cells 2, 3. In addition, it can provide sets of genome-level characters, such as the relative position of different genes, RNA secondary structures and modes of control of replication and transcription 4. Thus, the mitogenome has been widely used as an informative molecular marker for diverse evolutionary studies of animals in the past several decades, including phylogenetics and population genetics 5, 6. The insect mitogenome is a closed-circular duplex molecule, ranging from 13 to 20 kb in length, containing 13 protein-coding genes [PCGs: two ATPase subunits (atp6 and atp8), three cytochrome c oxidase subunits (cox1, cox2, and cox3), one cytochrome b (cob), and seven NADH dehydrogenase subunits (nad1, nad2, nad3, nad4, nad5, nad6, and nad4L)], two ribosomal RNA genes (rRNAs: rrnL and rrnS), and 22 transfer RNA genes (tRNAs). Additionally, it has at least one sequence known in insect mitogenomes as the A+T-rich region, which includes some initiation sites for transcription and replication of the genome 7, 8. The length of this region is highly variable in insects, due to indels and the presence of variable copy numbers of tandem repeat elements 9. To date, complete or near-complete mitogenomes have been sequenced from more than 241 species of insects. However, only 36 complete or near-complete mitogenomes are available for Lepidoptera, a large group of some 200,000 species. Sequences are only available from species in the Bombycoidea, Geometroidea, Papilionoidea, Noctuoidea, Tortricoidea, and Pyraloidea. 
Among the Pyraloidea, sequenced mitogenomes are available for six species: Cnaphalocrocis medinalis, Chilo suppressalis, Diatraea saccharalis, Maruca vitrata, Ostrinia furnacalis, and Ostrinia nubilalis 10-13. However, data for three of these pyraloids, M. vitrata, O. furnacalis and O. nubilalis, lack some sequence information. Additional sequenced mitogenomes from the Lepidoptera can provide more detail about molecular phylogenetics of this important group. The Asiatic leaf roller, C. medinalis, and the rice stem borer, C. suppressalis, are both well-known rice pests widely distributed in the main rice-growing regions of China, Japan, Korea, India, Australia, and other countries. In China, these rice pests are found from Heilongjiang to Taiwan and Hainan, and have caused serious yield losses in recent decades 14, 15. In this study, we report the complete mitogenomes of C. medinalis and C. suppressalis and give a thorough description of their genome features, including gene order, nucleotide composition of protein-coding genes, secondary structures of tRNA and rRNA genes, and the A+T-rich region. In addition, we compare the C. suppressalis mitogenome sequence from this study with one from a previous study. Detailed genetic information on these two important rice pests may help in the development of methods for their control or prevention. Furthermore, information on C. medinalis, a migratory species, may provide insights into the energy supply mechanism used in its migration.

Materials and Methods

DNA sample extraction

Adults of C. medinalis and C. suppressalis were collected from rice paddies in Yangzhou (32°40.025′N, 119°44.017′E), Jiangsu Province, China, in July-October 2010. Samples were preserved in 100% ethanol and stored at -70 °C until DNA extraction was performed. Whole genomic DNA was extracted from a single sample using the protocol of DNAVzol (Bioteke, Beijing, China) and then used for the PCR amplification.

PCR amplification, cloning and sequencing

The DNA samples from C. medinalis and C. suppressalis were amplified based on the primers from D. saccharalis, M. vitrata, O. furnacalis, O. nubilalis, and other lepidopteran sequences available in GenBank 11-13. All these primer pairs were designed using Primer Premier 5.0 software. Conditions for PCR amplification were as follows: an initial denaturation for 5 min at 95 °C, followed by 35 cycles of denaturation for 1 min at 94 °C, annealing for 1 min at 45-50 °C, elongation for 1-3 min (depending on putative length of the fragments) at 68 °C, and a final extension step of 72 °C for 10 min. LA Taq polymerase (TaKaRa, Dalian, China) was used in PCR amplification, except for fragments less than 1.3 kb, which were amplified with Taq polymerase (TaKaRa) instead. All PCR reactions were performed in an ABI thermal cycler (PE Applied Biosystems, San Francisco, CA, USA). PCR products were resolved by electrophoresis in a 1.0% agarose gel and extracted using a DNA Gel Extraction Kit (Bioteke, Beijing, China). Purified PCR products were ligated into T-vector (TaKaRa) before transformation into DH5α competent Escherichia coli cells. The positive recombinant clone was sequenced using upstream and downstream primers from both directions on ABI 3730XL Genetic Analyzer (PE Applied Biosystems) at least three times. The number of clones sequenced provided six-fold coverage of the mitogenome.

Genome annotation and secondary structures prediction

Protein-coding genes and rRNA genes were identified by comparison with other lepidopteran species previously sequenced 11-13. Protein-coding genes were aligned using Clustal X version 2.0 16. Composition skew analysis was carried out with the formulas AT skew = [A-T]/[A+T] and GC skew = [G-C]/[G+C] 17. The PCG nucleotide sequences without start and termination codons were translated on the basis of the Invertebrate Mitochondrial Genetic Code. The A+T content and codon usage were calculated using MEGA version 4.0 18. Transfer RNA genes were identified using tRNAscan-SE software available online at http://lowelab.ucsc.edu/tRNAscan-SE/ 19, and XRNA 1.2.0b was used to draw the secondary structure of tRNAs. The secondary structures of rrnL and rrnS genes were inferred based on models developed for other insect species 20-22. To infer the secondary structures of tRNA and rRNA genes, we used a commonly accepted comparative approach to correct for unusual pairings with RNA-editing mechanisms that are well known in arthropod mitogenomes 20, 23. A compensatory change was defined as two sequential substitutions that maintained base pairing at a given position of a helix; two or more Watson-Crick (or G-U) interactions at the same location in a putative helix were taken as evidence that base pairing was preserved and that the structure conformed to the helical model 24. The entire A+T-rich region was subjected to a search for tandem repeats using the Tandem Repeats Finder program 25.
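
The skew formulas above can be illustrated with a minimal Python sketch. It is not the software used in this study (MEGA 4.0 was used for composition); the function name `base_skews` and the toy sequence are assumptions made purely for illustration.

```python
from collections import Counter

def base_skews(seq: str) -> dict:
    """Return A+T fraction, AT skew and GC skew for one DNA strand."""
    counts = Counter(seq.upper())
    a, t, g, c = (counts[b] for b in "ATGC")
    at, gc = a + t, g + c
    return {
        "at_content": at / (at + gc) if at + gc else 0.0,
        "at_skew": (a - t) / at if at else 0.0,   # [A-T]/[A+T]
        "gc_skew": (g - c) / gc if gc else 0.0,   # [G-C]/[G+C]
    }

# Toy sequence, not taken from either mitogenome.
example = base_skews("ATTTTAAGC")
```

A negative AT skew, as in this toy sequence, means that T is more frequent than A on the strand analyzed.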

Phylogenetic analysis

The 34 complete or near-complete lepidopteran mitogenomes were downloaded from GenBank as references to determine phylogenetic relationships within this set of pyralids, using the mitogenome of Drosophila melanogaster as the outgroup 33 (Table 1). The amino acid sequences of each of the 13 PCGs were aligned with Clustal X version 2.0 using default settings and then concatenated 16. The concatenated set of amino acid sequences from all 13 PCGs was then used for phylogenetic analyses. The best-fitting model, selected by Modeltest 48 using likelihood ratio tests, was then used to perform Bayesian inference (BI) and maximum likelihood (ML) analyses using the program MrBayes 3.1.2 (http://morphbank.ebc.uu.SE/mrbayes/) 49 and a PHYML online web server 50. The BI analyses were conducted under the following conditions: 1,000,000 generations, four chains (one cold chain and three hot chains) and a burn-in step for the first 10,000 generations. The confidence values of the BI tree were expressed as the Bayesian posterior probabilities in percentages. The ML analysis was conducted using the proportion of invariable sites as “estimated,” the number of substitution rate categories as four, the gamma distribution parameter as “estimated,” and the starting tree as a BIONJ distance-based tree. The confidence values of the ML tree were evaluated via a bootstrap test with 100 iterations.
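
The concatenation step can be sketched as follows. This is an illustrative outline only, not the procedure or software used in the study; the function name `concatenate_alignments`, the gap-padding of missing taxa, and the toy amino acid strings are all assumptions.

```python
def concatenate_alignments(alignments, gene_order):
    """Join per-gene alignments into one supermatrix row per taxon.

    alignments: {gene: {taxon: aligned_sequence}}
    Taxa missing from a gene are padded with '-' so every row keeps the same length.
    """
    taxa = sorted({t for gene in gene_order for t in alignments[gene]})
    supermatrix = {t: [] for t in taxa}
    for gene in gene_order:
        block = alignments[gene]
        lengths = {len(s) for s in block.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        width = lengths.pop()
        for t in taxa:
            supermatrix[t].append(block.get(t, "-" * width))
    return {t: "".join(parts) for t, parts in supermatrix.items()}

# Toy alignments with invented residues; "Dm" lacks the second gene.
toy = {
    "atp8": {"Cm": "MPQM", "Cs": "MPQM", "Dm": "MP-M"},
    "cox1": {"Cm": "RKWLFS", "Cs": "RKWLFS"},
}
matrix = concatenate_alignments(toy, ["atp8", "cox1"])
```

Padding with gaps is one common way to keep near-complete mitogenomes in the matrix; whether it was applied here is not stated in the text.
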
Table 1

Source and information for the polygenomic analysis.

| Superfamily | Family | Species | Length/bp | Accession Number | References |
|---|---|---|---|---|---|
| Bombycoidea | Saturniidae | Antheraea pernyi | 15575 | AY242996 | 26 |
| | Saturniidae | Antheraea yamamai | 15338 | EU726630 | 27 |
| | Bombycidae | Bombyx mandarina | 15717 | FJ384796 | 28 |
| | Bombycidae | Bombyx mori | 15664 | AY048187 | 29 |
| | Saturniidae | Eriogyna pyretorum | 15327 | FJ685653 | 30 |
| | Sphingidae | Manduca sexta | 15516 | EU286785 | 31 |
| | Saturniidae | Saturnia boisduvalii | 15360 | EF622227 | 32 |
| Drosophiloidea | Drosophilidae | Drosophila melanogaster | 19517 | U37541 | 33 |
| Geometroidea | Geometridae | Phthonandria atrilineata | 15499 | EU569764 | 34 |
| Noctuoidea | Noctuidae | Helicoverpa armigera | 15347 | GU188273 | 35 |
| | Hypercompe | Hyphantria cunea | 15481 | GU592049 | 36 |
| | Lymantriidae | Lymantria dispar | 15569 | FJ617240 | 37 |
| | Notodontidae | Ochrogaster lunifer | 15593 | AM946601 | 38 |
| | Noctuidae | Sesamia inferens | 15413 | JN039362 | Unpublished |
| Papilionoidea | Acraeidae | Acraea issoria | 15245 | GQ376195 | 39 |
| | Pieridae | Artogeia melete | 15140 | EU597124 | 40 |
| | Nymphalidae | Apatura metis | 15236 | JF801742 | Unpublished |
| | Nymphalidae | Calinaga davidis | 15267 | HQ658143 | 41 |
| | Lycaenidae | Coreana raphaelis | 15314 | DQ102703 | 42 |
| | Nymphalidae | Hipparchia autonoe | 15489 | GQ868707 | 43 |
| | Papilionidae | Luehdorfia chinensis | 13860 | EU622524 | Unpublished |
| | Papilionidae | Papilio maraho | 16094 | FJ810212 | Unpublished |
| | Parnassiidae | Parnassius bremeri | 15389 | FJ871125 | 44 |
| | Papilionidae | Papilio xuthus | 13964 | EF621724 | 45 |
| | Nymphalidae | Sasakia charonda | 15244 | AP011824 | Unpublished |
| | Nymphalidae | Sasakia charonda uriyamaensis | 15222 | AP011825 | Unpublished |
| | Papilionidae | Teinopalpus aureus | 15242 | HM563681 | Unpublished |
| | Papilionidae | Troides aeacus | 15263 | EU625344 | Unpublished |
| Pyraloidea | Pyralidae | Chilo suppressalis | 15395 | JF339041 | Unpublished |
| | Pyralidae | Cnaphalocrocis medinalis | 15388 | JN246082 | Unpublished |
| | Pyralidae | Diatraea saccharalis | 15490 | FJ240227 | 11 |
| | Pyralidae | Maruca vitrata | 14054 | HM751150 | 12 |
| | Pyralidae | Ostrinia furnacalis | 14536 | AF467260 | 13 |
| | Pyralidae | Ostrinia nubilalis | 14535 | AF442957 | 13 |
| Tortricoidea | Tortricidae | Adoxophyes honmai | 15680 | DQ073916 | 46 |
| | Olethreutidae | Grapholita molesta | 15717 | HQ392511 | 24 |
| | Olethreutidae | Spilonota lechriaspis | 15368 | HM204705 | 47 |

Results

Genome structure and base composition

The complete mitogenomes of C. medinalis and C. suppressalis were found to be circular molecules 15,388 bp and 15,395 bp in size, respectively (Fig. 1). The sequences were submitted to GenBank under the accession numbers JN246082 (C. medinalis) and JF339041 (C. suppressalis). Both mitogenomes contained 37 typical mitochondrial genes (13 PCGs, 22 tRNA genes, and two rRNA genes), of which 23 were transcribed on the majority-coding strand (H-strand), and the remaining were transcribed on the minority-coding strand (L-strand) (Fig. 1 and Table 2).
Fig 1

Map of mitogenomes of Cnaphalocrocis medinalis and Chilo suppressalis. Transfer RNA genes are designated by single-letter amino acid codes. CR indicates A+T-rich region. Gene name without underline indicates the direction of transcription from left to right, and with underline indicates right to left.

Table 2

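Annotation of the mitogenomes of Cnaphalocrocis medinalis (Cm) and Chilo suppressalis (Cs).

| Gene | Direction | Region (Cm) | Region (Cs) | Size/bp (Cm) | Size/bp (Cs) | IGN (Cm) | IGN (Cs) | Start codon (Cm) | Start codon (Cs) | Stop codon (Cm) | Stop codon (Cs) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| trnM | F | 1..68 | 1..68 | 68 | 68 | 0 | -1 | - | - | - | - |
| trnI | F | 69..133 | 68..133 | 65 | 66 | -3 | 0 | - | - | - | - |
| trnQ | R | 131..199 | 134..202 | 69 | 69 | 74 | 52 | - | - | - | - |
| nad2 | F | 274..1290 | 255..1268 | 1017 | 1014 | 5 | 7 | ATT | ATT | TAA | TAA |
| trnW | F | 1296..1363 | 1276..1343 | 68 | 68 | -8 | -8 | - | - | - | - |
| trnC | R | 1356..1423 | 1336..1402 | 68 | 67 | 5 | 30 | - | - | - | - |
| trnY | R | 1429..1495 | 1433..1500 | 67 | 68 | 41 | 6 | - | - | - | - |
| cox1 | F | 1537..3067 | 1507..3037 | 1531 | 1531 | 3 | 0 | CGA | CGA | T- | T- |
| trnL2(UUR) | F | 3068..3134 | 3038..3104 | 67 | 67 | 0 | 0 | - | - | - | - |
| cox2 | F | 3135..3816 | 3105..3786 | 682 | 682 | 0 | 0 | ATG | ATA | T- | T- |
| trnK | F | 3817..3887 | 3787..3857 | 71 | 71 | 11 | 0 | - | - | - | - |
| trnD | F | 3899..3964 | 3858..3925 | 66 | 68 | 0 | 0 | - | - | - | - |
| atp8 | F | 3965..4132 | 3926..4090 | 168 | 165 | -7 | -7 | ATT | ATC | TAA | TAA |
| atp6 | F | 4126..4800 | 4084..4764 | 675 | 681 | -1 | -1 | ATG | ATG | TAA | TAA |
| cox3 | F | 4800..5588 | 4764..5552 | 789 | 789 | 2 | 2 | ATG | ATG | TAA | TAA |
| trnG | F | 5591..5655 | 5555..5621 | 65 | 67 | 0 | 0 | - | - | - | - |
| nad3 | F | 5656..6009 | 5622..5975 | 354 | 354 | 46 | 101 | ATT | ATT | TAA | TAA |
| trnA | F | 6056..6121 | 6077..6144 | 66 | 68 | 1 | -1 | - | - | - | - |
| trnR | F | 6123..6189 | 6144..6212 | 67 | 69 | 0 | -1 | - | - | - | - |
| trnN | F | 6190..6256 | 6212..6278 | 67 | 67 | 2 | 1 | - | - | - | - |
| trnS1(AGN) | F | 6259..6324 | 6280..6345 | 66 | 66 | 1 | 0 | - | - | - | - |
| trnE | F | 6326..6393 | 6346..6414 | 68 | 69 | -2 | 0 | - | - | - | - |
| trnF | R | 6392..6460 | 6415..6481 | 69 | 67 | 0 | -17 | - | - | - | - |
| nad5 | R | 6461..8195 | 6465..8216 | 1735 | 1752 | 0 | 0 | ATT | ATT | T- | TAA |
| trnH | R | 8196..8261 | 8217..8281 | 66 | 65 | 0 | 0 | - | - | - | - |
| nad4 | R | 8262..9601 | 8282..9621 | 1340 | 1340 | -1 | 7 | ATG | ATG | TA- | TA- |
| nad4L | R | 9601..9894 | 9629..9922 | 294 | 294 | 2 | 2 | ATG | ATG | TAA | TAA |
| trnT | F | 9897..9961 | 9925..9989 | 65 | 65 | 0 | 0 | - | - | - | - |
| trnP | R | 9962..10027 | 9990..10055 | 66 | 66 | 2 | 2 | - | - | - | - |
| nad6 | F | 10030..10566 | 10058..10597 | 537 | 540 | 6 | -1 | ATT | ATT | TAA | TAA |
| cob | F | 10573..11721 | 10597..11742 | 1149 | 1146 | 1 | 3 | ATG | ATG | TAA | TAA |
| trnS2(UCN) | F | 11723..11790 | 11746..11811 | 68 | 66 | 16 | 12 | - | - | - | - |
| nad1 | R | 11807..12745 | 11824..12765 | 939 | 942 | 1 | 1 | ATG | ATG | TAA | TAA |
| trnL1(CUN) | R | 12747..12813 | 12767..12833 | 67 | 67 | -1 | -25 | - | - | - | - |
| rrnL | R | 12814..14202 | 12809..14191 | 1384 | 1383 | 0 | 0 | - | - | - | - |
| trnV | R | 14203..14268 | 14192..14259 | 66 | 68 | 0 | 0 | - | - | - | - |
| rrnS | R | 14269..15049 | 14260..15047 | 781 | 788 | 0 | 0 | - | - | - | - |
| A+T-rich region | | 15050..15388 | 15048..15395 | 339 | 348 | 0 | 0 | - | - | - | - |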

IGN indicates intergenic nucleotide.

The nucleotide compositions of the whole mitogenome of C. medinalis and C. suppressalis were as follows: (A) 40.3%, 40.6%; (T) 41.6%, 40.0%; (G) 7.5%, 7.5%; and (C) 10.6%, 11.9%, respectively. The whole mitogenome of C. medinalis was biased towards AT nucleotides (81.9%), similar to that of C. suppressalis (80.6%) (Table 3). Furthermore, GC% content as well as AT- and GC-skews were calculated for the mitogenomes of C. medinalis and C. suppressalis (Table 3). The results showed that the AT-skew of the C. medinalis mitogenome was -0.016 and was biased to use T rather than A, whereas the C. suppressalis mitogenome exhibited an AT-skew of 0.007. Meanwhile, GC-skews were similar in both the C. medinalis (-0.171) and C. suppressalis (-0.227) mitogenomes (Table 3).
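
As a worked check, the whole-mitogenome skews quoted above can be recomputed from the reported base percentages. The short Python sketch below uses only the figures given in this paragraph; the helper name `skew` is an assumption.

```python
def skew(x: float, y: float) -> float:
    """Skew of x over y: (x - y) / (x + y)."""
    return (x - y) / (x + y)

# Whole-mitogenome base percentages as reported in the text.
composition = {
    "C. medinalis": {"A": 40.3, "T": 41.6, "G": 7.5, "C": 10.6},
    "C. suppressalis": {"A": 40.6, "T": 40.0, "G": 7.5, "C": 11.9},
}

# (AT skew, GC skew) per species, rounded to three decimals as in Table 3.
skews = {
    species: (round(skew(p["A"], p["T"]), 3), round(skew(p["G"], p["C"]), 3))
    for species, p in composition.items()
}
```

The rounded values agree with those reported: -0.016 and -0.171 for C. medinalis, and 0.007 and -0.227 for C. suppressalis.
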
Table 3

Skewed nucleotide composition in regions of Cnaphalocrocis medinalis (Cm) and Chilo suppressalis (Cs) mitogenomes.

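| Region | A/% (Cm) | A/% (Cs) | G/% (Cm) | G/% (Cs) | T/% (Cm) | T/% (Cs) | C/% (Cm) | C/% (Cs) | A+T/% (Cm) | A+T/% (Cs) | G+C/% (Cm) | G+C/% (Cs) | AT skew (Cm) | AT skew (Cs) | GC skew (Cm) | GC skew (Cs) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Whole mitogenome | 40.3 | 40.6 | 7.5 | 7.5 | 41.6 | 40.0 | 10.6 | 11.9 | 81.9 | 80.6 | 18.1 | 19.4 | -0.016 | 0.007 | -0.171 | -0.227 |
| PCGs | 34.7 | 33.9 | 10.2 | 10.8 | 45.8 | 45.0 | 9.3 | 10.3 | 80.5 | 78.9 | 19.5 | 21.2 | -0.138 | -0.140 | 0.046 | 0.024 |
| 1st codon position | 37.1 | 22.2 | 15.9 | 13.1 | 37.5 | 48.4 | 9.5 | 16.3 | 74.6 | 70.6 | 25.4 | 29.4 | -0.005 | -0.371 | 0.252 | -0.109 |
| 2nd codon position | 22.3 | 42.4 | 13.0 | 3.2 | 48.7 | 50.0 | 16.0 | 4.4 | 71.0 | 92.4 | 29.0 | 7.6 | -0.372 | -0.082 | -0.103 | -0.158 |
| 3rd codon position | 44.8 | 37.0 | 1.7 | 16.1 | 51.2 | 36.6 | 2.3 | 10.3 | 96.0 | 73.6 | 4.0 | 26.4 | -0.067 | 0.005 | -0.150 | 0.220 |
| tRNAs | 41.4 | 41.4 | 10.2 | 10.3 | 41.0 | 40.6 | 7.4 | 7.7 | 82.4 | 82.0 | 17.6 | 18.0 | 0.005 | 0.021 | 0.159 | 0.144 |
| rRNAs | 43.9 | 43.7 | 9.6 | 10.1 | 41.5 | 41.3 | 5.0 | 5.0 | 85.4 | 85.0 | 14.6 | 15.1 | 0.028 | 0.028 | 0.315 | 0.338 |
| A+T-rich region | 42.5 | 42.2 | 0.88 | 0.3 | 53.4 | 53.0 | 3.2 | 4.8 | 95.9 | 95.2 | 4.1 | 5.1 | -0.114 | -0.113 | -0.566 | -0.882 |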

Protein-coding genes

The protein-coding gene (PCG) regions of the C. medinalis and C. suppressalis mitogenomes were consistent with those of other pyralid moth mitogenomes, both containing 13 PCGs (Table 2). The genes were not all encoded on the same strand: nine PCGs (nad2, cox1, cox2, atp8, atp6, cox3, nad3, nad6, and cob) were coded by the H-strand, while the remaining four PCGs (nad1, nad4, nad4L, and nad5) were coded by the L-strand in both the C. medinalis and C. suppressalis mitogenomes. The start and stop codons of the 13 PCGs are shown in Table 2, with 12 PCGs utilizing the standard ATN codons (ATA, ATC, ATG, and ATT) as observed in invertebrate mitogenomes 51. However, the cox1 gene used CGA as the initiation codon in both the C. medinalis and C. suppressalis mitogenomes. Notably, the cox2 and atp8 genes started with ATG and ATT, respectively, in the C. medinalis mitogenome, whereas in the C. suppressalis mitogenome the cox2 gene was initiated by ATA and the atp8 gene by ATC. In the C. medinalis mitogenome, nine PCGs were terminated by the standard stop codon TAA, whereas the cox1, cox2 and nad5 genes used T, and the nad4 gene utilized TA as a truncated stop codon. However, in the C. suppressalis mitogenome, there were 10 PCGs with the standard stop codon TAA, while the cox1 and cox2 genes used T and the nad4 gene used TA as truncated termination codons. The average A+T contents of PCGs (without start and stop codons) in the C. medinalis and C. suppressalis mitogenomes were 80.5 and 78.9%, respectively. The AT- and GC-skew values were calculated to analyze the AT and GC bias of the PCGs. The results demonstrated that AT-skews were negative (-0.138; -0.140), but the GC-skews were slightly positive (0.046; 0.024). The A+T content of the third codon position (96.0%) in the C. medinalis mitogenome was higher than that of the first (74.6%) and second (71.0%) positions. However, the A+T content of the second codon position (92.4%) in the C. suppressalis mitogenome was much higher than that of the first (70.6%) and third (73.6%) positions. Excluding the start and termination codons, the 13 PCGs in the C. medinalis mitogenome consisted of 3,713 codons, similar to C. suppressalis (3,720). The behavior of codon families in the PCGs was determined (Figs. 2 and 3), with the start and stop codons excluded from the analysis to avoid biases due to unusual putative start and stop codons. The codon families were very different between the C. medinalis and C. suppressalis mitogenomes (Figs. 2 and 3).
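
The A+T content at each codon position can be computed along the following lines. This is a minimal sketch rather than the MEGA 4.0 procedure used in the study; the function name and toy sequence are assumptions, and the caller is expected to have already removed start and stop codons.

```python
def at_content_by_codon_position(cds: str) -> list:
    """Return the A+T percentage at codon positions 1, 2 and 3 of an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon such as T or TA
    result = []
    for offset in range(3):
        column = cds[offset:usable:3]  # every third base, starting at this position
        at = sum(base in "AT" for base in column)
        result.append(100.0 * at / len(column) if column else 0.0)
    return result

# Toy in-frame sequence (codons ATT, GCA, TTA), invented for illustration.
percentages = at_content_by_codon_position("ATTGCATTA")
```
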
Fig 2

Codon distribution in pyralid moth mitogenomes. Numbers to the left refer to the total number of codons. CDspT, codons per thousand codons. Codon families are given on the x axis.

Fig 3

Relative Synonymous Codon Usage (RSCU) in pyralid moth mitogenomes. Codon families are given on the x axis. Codons that are not present in the genome are indicated in red.

Asn, Ile, Leu, Phe, and Ser were the most abundant amino acids in D. saccharalis, M. vitrata, O. furnacalis and O. nubilalis analyzed 11-13. However, the five most common codon families (Ile, Leu2, Met, Phe, and Tyr), each with at least 50 codons (CDs) per thousand CDs, were two-fold degenerate in codon usage and were rich in A and T in the C. medinalis mitogenome. Only Leu and Phe units with at least 100 CDs were found in the C. medinalis mitogenome, whereas Ile and Leu were found in C. suppressalis and other pyralid moth mitogenomes 11-13. Nine codon families encoding Ala, Arg, Cys, Gly, Pro, Ser1, Ser2, Thr, and Trp displayed no more than 12 CDs in C. medinalis, lower than in C. suppressalis and other pyralids 11-13 (Fig. 2). In addition, the AT-rich CDs were favored over synonymous CDs with a lower A+T content, which was consistent with the results of Relative Synonymous Codon Usage (RSCU), especially for the Leu2 family, where the TTA codon accounted for the large majority of CDs in the family. Two- and four-fold degenerate codon usages were biased to use more A and T than G and C in the third position. The lost codons usually belonged to GC-rich codon-families (Fig. 3).
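
For reference, RSCU and codons per thousand codons (CDspT) can be computed as in the hedged Python sketch below. It is not the MEGA 4.0 implementation used in the study; the function names and the toy sequence are assumptions, and the Leu2 family is taken to comprise TTA and TTG.

```python
from collections import Counter

def codon_counts(cds: str) -> Counter:
    """Count complete in-frame codons in a coding sequence."""
    cds = cds.upper()
    return Counter(cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))

def rscu(counts: Counter, family: list) -> dict:
    """RSCU = observed count / (family total / number of synonymous codons)."""
    total = sum(counts[c] for c in family)
    if total == 0:
        return {c: 0.0 for c in family}
    expected = total / len(family)
    return {c: counts[c] / expected for c in family}

def per_thousand(counts: Counter, family: list) -> float:
    """Codons of one family per thousand codons (CDspT)."""
    all_codons = sum(counts.values())
    return 1000.0 * sum(counts[c] for c in family) / all_codons if all_codons else 0.0

# Toy sequence: three TTA, one TTG and one ATT codon, invented for illustration.
counts = codon_counts("TTATTATTATTGATT")
leu2 = rscu(counts, ["TTA", "TTG"])           # Leu2 (UUR) family
leu2_cdspt = per_thousand(counts, ["TTA", "TTG"])
```

An RSCU above 1 indicates that a codon is used more often than expected under equal usage of its synonyms, which is the pattern described above for TTA.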

Transfer RNA genes

Both the C. medinalis and C. suppressalis mitogenomes contained the set of 22 tRNAs found in other pyralid moth mitogenomes and typical of animal mitogenomes 1. The 22 tRNA genes (totaling 1,474 bp and 1,482 bp, respectively, and ranging in size from 65 to 71 bp) were interspersed with rRNAs or PCGs and had 82.4 and 82.0% A+T content in the C. medinalis and C. suppressalis mitogenomes, respectively. Fourteen tRNAs were encoded by the H-strand and eight by the L-strand. All tRNA genes had the typical cloverleaf secondary structure with their respective anticodons, except for the trnS1(AGN) gene, in which the dihydrouridine (DHU) arm was replaced by a simple loop in both the C. medinalis and C. suppressalis mitogenomes (Fig. 4).
Fig 4

Inferred secondary structures for 22 typical tRNAs of the Cnaphalocrocis medinalis mitogenome. The tRNAs are labeled with the abbreviations of their corresponding amino acids. Base-pairing is indicated as follows: Watson-Crick pairs by lines, wobble GU pairs by dots and other non-canonical pairs by circles.

The tRNA genes usually contained a 7-bp amino acid acceptor (AA) stem, where most nucleotide substitutions were compensatory. The anticodon (AC) stem and the loop (7 bp) were both conserved in all tRNAs, where two U-U pairs were usually located at the second and third couplets in the stem. The length of the DHU stem was 3-4 bp, except for trnS1(AGN). The TΨC arm was usually 3-6 bp in length. Compared to other lepidopteran species, there were 24 mismatched base pairs in 16 tRNA secondary structures in the mitogenome of C. medinalis, including 15 G-U pairs, seven U-U pairs, one A-C pair, and one C-U pair. In C. suppressalis, a total of 23 mismatched base pairs were found in 16 mitochondrial tRNA secondary structures, with 16 G-U pairs, four U-U pairs, one A-A pair, one A-C pair, and one C-C pair (Supplementary Material: Fig. S1).

Ribosomal RNA genes

As in the mitogenomes of other insects, there were two ribosomal genes, a 1,384 bp/1,383 bp rrnL gene and a 781 bp/788 bp rrnS gene in the C. medinalis and C. suppressalis mitogenomes, respectively. The locations of rRNAs in the two species were the same as in other pyralid moth mitogenomes. The rrnL gene resided between trnL1(CUN) and trnV, and the rrnS gene between trnV and the A+T-rich region. Both of the secondary structures of the rrnL and rrnS genes were inferred from models proposed for other insects 20-22, 31. Six domains with 49 helices were present in the C. medinalis rrnL and C. suppressalis rrnL genes (Supplementary Material: Fig. S2) as in Apis mellifera 21 and Manduca sexta 31 (Fig. 5). Meanwhile, there were 33 helices in the C. medinalis rrnS and C. suppressalis rrnS genes (Supplementary Material: Fig. S3) in three domains (labeled I, II, III) (Fig. 6), which were largely in agreement with those proposed for A. mellifera, M. sexta, and other insect species 20-22, 31.
Fig 5

Predicted secondary structure of the rrnL gene in the Cnaphalocrocis medinalis mitogenome. Tertiary interactions and base triples are shown connected by continuous lines. A represents the 5' half of rrnL, with the remaining 3' half in B. Base-pairing is indicated as follows: Watson-Crick pairs by lines, wobble GU pairs by dots and other non-canonical pairs by circles.

Fig 6

Predicted secondary structure of the rrnS gene in the Cnaphalocrocis medinalis mitogenome. Tertiary interactions and base triples are shown connected by continuous lines. Base-pairing is indicated as follows: Watson-Crick pairs by lines, wobble GU pairs by dots and other non-canonical pairs by circles.

Non-coding and overlapping regions

The mitogenomes of C. medinalis and C. suppressalis contained 219 bp and 221 bp of intergenic spacer sequences, respectively (Table 2). Additionally, there were four major intergenic spacers (S1, S2, S3, and S4) with at least 12 bp in C. medinalis and C. suppressalis, all of which were rich in A and T. The features of the S1-S4 spacers are illustrated below (Fig. 7).
Fig 7

Intergenic spacer sequences in the mitogenomes of Cnaphalocrocis medinalis (A) and Chilo suppressalis (B).

The S1 spacer (74 bp), located between trnQ and nad2, appeared to be the result of a duplicated 20-bp segment and a poly-T motif within the last 16 bp with minor changes in C. medinalis. In addition, there was a triplication and a similar poly-T motif within the last 19 bp in C. suppressalis. The S2 spacer (41 bp) located between the trnY and cox1 genes was derived from the microsatellite '(TA)17' in C. medinalis. However, the S2 spacer (30 bp) found between the trnC and trnY genes was derived from a duplicated segment with minor changes in C. suppressalis. Spacer S3 (46 bp) was present between nad3 and trnA and contained a microsatellite '(AT)18' in C. medinalis. Furthermore, the S3 spacer (101 bp) in C. suppressalis featured a microsatellite '(AT)19' and a duplicated segment. The S4 spacer, located between trnS2(UCN) and nad1, contained a 7-bp motif 'ATACTAA' in both C. medinalis and C. suppressalis. In addition, overlapping nucleotides totaled 23 bp at seven locations in C. medinalis and 62 bp at nine locations in C. suppressalis (Table 2). The overlaps of the gene coding sequence between trnW and trnC (8 bp), atp8 and atp6 (7 bp), and atp6 and cox3 (1 bp) in the two mitogenomes indicate that the mature transcripts remain intact in order to preserve their respective coding sequences. There was a 7 bp motif 'AGCCTTA' between trnW and trnC in C. medinalis and C. suppressalis. An overlap of seven nucleotides 'ATGATAA' was observed between atp8 and atp6 in both C. medinalis and C. suppressalis, which is a common feature of many other insect mitogenomes. Furthermore, there was a 17 bp motif 'TTATAAGCTATTTAAAT' between trnF and nad5 in C. suppressalis.
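
The spacer and overlap sizes discussed above follow directly from the gene coordinates in Table 2. The sketch below recomputes them for a contiguous run of C. medinalis genes; the function name `intergenic_nucleotides` is an assumption, and the coordinates are copied from Table 2.

```python
def intergenic_nucleotides(genes):
    """Return (gene, next_gene, IGN) for consecutive genes sorted by start.

    IGN = next_start - end - 1; positive is a spacer, negative is an overlap.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    return [
        (name, nxt_name, nxt_start - end - 1)
        for (name, _, end), (nxt_name, nxt_start, _) in zip(ordered, ordered[1:])
    ]

# C. medinalis coordinates from Table 2 as (name, start, end).
cm_genes = [
    ("trnQ", 131, 199), ("nad2", 274, 1290), ("trnW", 1296, 1363),
    ("trnC", 1356, 1423), ("trnY", 1429, 1495), ("cox1", 1537, 3067),
]
ign = intergenic_nucleotides(cm_genes)
```

For these genes the computed values reproduce the 74-bp S1 spacer (trnQ to nad2), the 8-bp trnW/trnC overlap, and the 41-bp S2 spacer (trnY to cox1).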

The A+T-rich region

The A+T-rich region, flanked by the rrnS and trnM genes, was 339 bp in length with a 95.9% A+T content in C. medinalis and 348 bp in length with a 95.2% A+T content in C. suppressalis. There was a conserved structure that consisted of the motif 'ATAGT(A)' and a poly-T stretch in C. medinalis (Fig. 8A). A very similar pattern occurred in C. suppressalis, where the sequence 'ATAGA' was followed by a poly-T stretch (Fig. 8B). A duplicated 25-bp repeat element was found in C. medinalis, which was similar to that in C. suppressalis (a duplicated 31-bp repeat element). There was also a stem-and-loop structure in C. medinalis (Figs. 8A and 9), but not in C. suppressalis. Furthermore, a microsatellite '(TA)13' was observed in C. medinalis, while a poly-T motif and a triplicated 19-bp repeat element were detected in C. suppressalis (Fig. 8B).
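
Perfect microsatellites such as '(TA)13' can be located with a simple pattern search. The sketch below is a regex-based stand-in for illustration, not the Tandem Repeats Finder program used in this study; the function name, the `min_copies` threshold, and the toy sequence are assumptions.

```python
import re

def find_dinucleotide_repeats(seq: str, unit: str = "TA", min_copies: int = 5):
    """Return (start, copies) for each perfect tandem run of `unit` with >= min_copies copies.

    Start positions are 1-based. Imperfect repeats are not detected.
    """
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}", re.IGNORECASE)
    return [(m.start() + 1, len(m.group()) // len(unit)) for m in pattern.finditer(seq)]

# Toy sequence with a (TA)6 run starting at position 5, invented for illustration.
toy_seq = "GGCC" + "TA" * 6 + "GG" + "TA" * 2
hits = find_dinucleotide_repeats(toy_seq, unit="TA", min_copies=5)
```

Unlike a dedicated tandem-repeat tool, this approach does not tolerate substitutions or indels within the repeat.
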
Fig 8

Structures of the A+T-rich regions of Cnaphalocrocis medinalis (A) and Chilo suppressalis (B) mitogenomes.

Fig 9

The potential stem-loop structure found in the Cnaphalocrocis medinalis A+T-rich region. Boxed sequences in the flanking region of the stem-loop structures are the motifs, possessing 3' flanking 'GAAAT' and the 5' flanking 'TATA' (nucleotides in box) as described in Zhang et al. 5.

In this study, the amino acid sequences of the 13 PCGs were concatenated to reconstruct phylogenetic relationships. Based on morphological analysis and mitochondrial analysis of the species studied here, and other species from the literature, the relationships among six superfamilies in Lepidoptera were reconstructed, as shown in Fig. 10 52.
Fig 10

Phylogeny of lepidopteran insects. (A) Current hypothesis of lepidopteran superfamily relationships after Kristensen and Skalski (1999) 54. (B) Inferred phylogenetic relationships among Lepidoptera based on amino acid sequences of the 13 mitochondrial PCGs using Bayesian inference (BI) and maximum likelihood (ML). Numbers at each node are Bayesian posterior probabilities as percentages (first value) and ML bootstrap support values (second value). The dipteran D. melanogaster was used as an outgroup 33. The scale bar indicates the number of substitutions per site.

The C. medinalis and C. suppressalis mitogenomes were grouped with those of D. saccharalis, M. vitrata, O. furnacalis, and O. nubilalis in the family Pyralidae (Pyraloidea) and clustered with other superfamilies, including the Papilionoidea (14 taxa), Tortricoidea (three taxa), Bombycoidea (seven taxa), Geometroidea (one taxon), and Noctuoidea (five taxa). The phylogenetic analyses indicated a close relationship between C. medinalis, C. suppressalis, and other pyralid moths, with 100% bootstrap support, which was consistent with the morphological classification. Bombycidae (two taxa), Saturniidae (four taxa), and Sphingidae (one taxon) were clustered in one branch in the phylogenetic tree. Geometroidea was the superfamily most closely related to the Noctuoidea, a finding consistent with morphological analysis. Tortricoidea was the sister superfamily to the remaining lepidopteran superfamilies, in agreement with the tree topology 53, 54.

Discussion

The mitogenome of C. medinalis was reported for the first time in this study. Although the mitogenome of C. suppressalis was reported earlier 10, we independently sequenced the complete mitogenome as part of this study. We not only described the mitogenome features of C. medinalis and C. suppressalis, but also compared the two. Because the evolutionary aspects of the C. suppressalis mitogenome were not discussed in detail in the previous report 10, we presented our mitogenome sequence for C. suppressalis in the context of an evolutionary analysis of the genus, to provide an insight into species diversity. The sizes of the C. medinalis and C. suppressalis mitogenomes from our study were similar to those of other pyralid moths, with all of the 37 typical mitochondrial genes identified in both mitogenomes. The genome organization and order seen in C. medinalis and C. suppressalis were typical for insect mitogenomes (Table 1), and are the same as those of other pyralid moths 11-13. Four mitochondrial DNA gene fragments (cox1, cox2, nad1, and rrnL) and four microsatellite DNA markers [(AC)n, (GT)n, and two (CA)n loci] were previously used to perform a phylogenetic analysis of C. suppressalis from 18 localities in China, and it was found that C. suppressalis was highly differentiated among the different geographical populations 55. Although the mitogenome of C. suppressalis from our study shared only 94% sequence identity with the published C. suppressalis mitogenome 10, which is 15,465 bp long, its genomic organization and order were very similar. Also, the A+T content of all components was similar in the two C. suppressalis mitogenomes (present study and previous report, respectively): whole mitogenome (80.6%, 79.7%), PCGs (78.9%, 77.7%), tRNAs (82.0%, 84.7%), rRNAs (85.0%, 81.2%), and A+T-rich region (95.2%, 94.2%). But for some components, the AT- and GC-skews were different between the two mitogenome sequences. 
These nucleotide compositional biases may be attributable to natural selection and/or different mutational pressures 56.

In this study, the start codons of the 13 PCGs in the C. suppressalis mitogenome were identical to those of the previously reported sequence 10, except for the nad6 gene. The cox1 gene has the start codon CGA in most lepidopterans 24. The cox1 gene is considered to be one of the more conserved mitochondrial genes, but its start codon is variable, which has been extensively discussed in relation to various insect and arthropod species 10, 33, 57. Some PCGs, especially the cox1 and cox2 genes, have incomplete stop codons in lepidopteran species, as previously reported in other insects 58-60. It is commonly believed that the complete TAA terminator is generated by post-transcriptional polyadenylation 61.

The codon families with high CDs have a prevalence of A and T in the third position, which might reflect selection for optimal tRNA usage, genome bias, and the speed and efficacy of the genome/DNA repair mechanisms 62. The sequences of the 13 PCGs are very similar between C. medinalis and C. suppressalis, but the codon distribution and RSCU of codon families differ between the two species. All codons were present in the C. medinalis PCGs, except for GCA (Alanine), CGC (Arginine), CGT (Arginine), and GGC (Glycine). Meanwhile, CGG (Arginine) and CTG (Leucine) were not present in the C. suppressalis PCGs, reflecting the influence of strongly biased codon usage 63. This codon bias may have resulted from genetic mutation, selection pressures, or genetic drift 56. Another explanation may be the migratory habits of C. medinalis. The mitogenome encodes proteins involved in the oxidative reactions of cells, so it may play an important role in supplying the energy needed for migration. However, the PCGs within the C. suppressalis sequence in the present study had 3,721 codons, with a codon distribution and RSCU of codon families similar to those of the previously reported C. suppressalis mitogenome 10.

The size and A+T content of the 22 tRNA genes in C. suppressalis were also similar to those of the published mitogenome (1,478 bp long with 84.7% A+T content) 10. There were a total of 26 mismatched base pairs in the previously published C. suppressalis mitogenome, including 14 G-U pairs, three U-U pairs, five A-C pairs, one A-A pair, one A-G pair, one C-C pair, and one C-U pair 10. Such mismatches are common in arthropod mitogenomes and are mostly located in the acceptor and anticodon stems 64.

Both the rrnL and rrnS genes in C. suppressalis from this study were shorter than the rRNAs (1,442 bp, 789 bp) reported previously 10. We also used a commonly accepted comparative approach to construct the secondary structures of the rRNAs 65. Base pairing is imperfect in the 5' half of the rrnL gene in C. medinalis (Fig. 5A). Mispairings were observed in the stem regions of H991 and other helices under Watson-Crick pairing criteria, whereas the rrnL secondary structure of the C. suppressalis mitogenome showed extensive base pairing (Supplementary Material: Fig. S2). The rRNA structures we presented were based mainly on sequence comparison and mathematical methods, and the reason for the high occurrence of mispairings in C. medinalis is not clear. In the secondary structure of the rrnS gene, helix H47 has a smaller loop in C. medinalis and C. suppressalis than in M. sexta; this helix appears to be variable among species and may therefore be useful, together with H39 and H367, for inferring phylogenetic relationships 8, 31. Helices H673, H1047, H1068, and H1074 in C. medinalis and C. suppressalis differ in length and structure from those in Alloeorhynchus bakeri (Hemiptera) and M. sexta, but are similar to those in Grapholita molesta, Liriomyza trifolii (Diptera), and Libelloides macaronius (Neuroptera) 22, 31, 66-68. Nucleotide variability is unevenly distributed among the domains and helices of the rrnL and rrnS secondary structures. Interestingly, some helices in the secondary structures of rrnL and rrnS differ among lepidopteran species but are very similar to those of other insect orders.

Although the sizes of three major intergenic spacers differ slightly between the C. suppressalis mitogenome of the present study and the previously reported one 10, the spacers have similar genomic locations and similar motifs. For example, the longest spacer is located between the trnC and trnY genes and contains the microsatellite repeat '(AT)n' commonly observed in other insects 10, 11, 28, 29. In spacer S4, there is a 7 bp motif 'ATACTAA' in both C. medinalis and C. suppressalis. Similarly, in this region there is a 5 bp motif 'TACTA' that is conserved in Coleoptera 69, and a 6 bp conserved motif 'THACWW' found in Hymenoptera 8. The similar motif found here thus appears to be the most conserved one in Lepidoptera. The nucleotide overlap 'AGCCTTA' between the trnW and trnC genes was detected in both C. suppressalis mitogenome sequences and in other lepidopteran species 10, 13. Furthermore, the nucleotide overlap 'TTATAAGCTATTTAAAT' between trnF and nad5 in the C. suppressalis mitogenome resembles those observed in the O. furnacalis (with 'TTATAAGCTATTTA') and O. nubilalis (with 'TTATAAGCTATTTAAA') mitogenomes 13, all of which suggests that spacer sequences could be used for studying higher-level phylogeny.

In the mitogenome there is often a non-coding region rich in A and T (the A+T-rich region), which varies considerably in length among insect species, or even within the same species 70.
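The motif comparisons above, such as locating 'ATACTAA', the degenerate 'THACWW', or an '(AT)n' microsatellite in a spacer, can be sketched with regular expressions. This is a minimal illustration under stated assumptions, not the method used in the study: the helper names (`motif_to_regex`, `find_motif`, `find_tandem_repeats`) and the toy sequences are hypothetical, and degenerate symbols are expanded with the standard IUPAC nucleotide codes.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def motif_to_regex(motif: str) -> str:
    """Translate an IUPAC motif such as 'THACWW' into a regex pattern."""
    parts = []
    for symbol in motif.upper():
        bases = IUPAC[symbol]
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


def find_motif(seq: str, motif: str) -> list:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead lets the scan report overlapping occurrences as well.
    pattern = re.compile(f"(?=({motif_to_regex(motif)}))")
    return [m.start() for m in pattern.finditer(seq.upper())]


def find_tandem_repeats(seq: str, unit: str, min_copies: int = 3) -> list:
    """Return (start, copy_number) for runs of `unit` repeated >= min_copies."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())}){{{min_copies},}}")
    return [
        (m.start(), len(m.group()) // len(unit))
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from either mitogenome.
    print(find_motif("GGATACTAACC", "ATACTAA"))
    print(find_motif("CCTTACTAGG", "THACWW"))
    print(find_tandem_repeats("GGATATATATATCC", "AT"))
```

The same `find_tandem_repeats` helper can be pointed at single-base units (for example `unit="T"`) to locate poly-T stretches; exact matching of this kind will miss imperfect repeats, which dedicated tandem-repeat software handles.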
This region includes various copy numbers of certain tandem-repeat elements scattered throughout, such as TA repeats, poly-A stretches, and poly-T stretches. The combined structural motif 'ATAGA(T)' followed by a poly-T stretch has been widely observed in other lepidopteran mitogenomes and might mark the origin of light-strand replication 10, 70. The stem-and-loop structure, with the conserved 3' flanking 'G(A)nT' and 5' flanking 'TATA' sequences, was observed in the C. medinalis mitogenome and is considered to be the site of initiation of secondary strand synthesis 5, 70, 71 (Figs. 8A and 9). This stem-and-loop structure is often found in the A+T-rich region of insects, including several orders such as Orthoptera, Lepidoptera, Diptera, Plecoptera, and Hymenoptera 5, 40, 70, 72, but it was not present in C. suppressalis 10. The A+T-rich regions differed in many respects between the two C. suppressalis mitogenomes, probably because this region is highly variable. However, both mitogenomes contained the motif 'ATAGA' followed by a 19 bp poly-T stretch.

In this study, the reconstructed phylogenetic tree disagreed with previous research on six lepidopteran superfamilies 12, 30, 36, 41. This may be because we had access to more lepidopteran mitogenomes than past studies did. Such a larger set of taxa might better explain the underlying phylogenetic relationships in the Lepidoptera. Our results provide some clarification about the placement of the Bombycoidea while remaining broadly consistent with the previous morphological classification of the Lepidoptera 53, 54. However, the addition of the C. medinalis and C. suppressalis mitogenomes to the literature did not resolve uncertainties about the position of Pyralidae. Still more mitogenomes will likely be required to fully resolve the relationships among these superfamilies of Lepidoptera.

Supplementary Material

Fig. S1: Inferred secondary structures for 22 typical tRNAs of the Chilo suppressalis mitogenome.
Fig. S2: Predicted secondary structure of the rrnL gene in the Chilo suppressalis mitogenome.
Fig. S3: Predicted secondary structure of the rrnS gene in the Chilo suppressalis mitogenome.
References (64 in total; first 10 listed)

1.  Evolutionary dynamics of a mitochondrial rearrangement "hot spot" in the Hymenoptera.

Authors:  M Dowton; A D Austin
Journal:  Mol Biol Evol       Date:  1999-02       Impact factor: 16.240

Review 2.  Animal mitochondrial genomes.

Authors:  J L Boore
Journal:  Nucleic Acids Res       Date:  1999-04-15       Impact factor: 16.971

3.  Tandem repeats finder: a program to analyze DNA sequences.

Authors:  G Benson
Journal:  Nucleic Acids Res       Date:  1999-01-15       Impact factor: 16.971

4.  MODELTEST: testing the model of DNA substitution.

Authors:  D Posada; K A Crandall
Journal:  Bioinformatics       Date:  1998       Impact factor: 6.937

5.  Maximizing transcription efficiency causes codon usage bias.

Authors:  X Xia
Journal:  Genetics       Date:  1996-11       Impact factor: 4.562

6.  tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence.

Authors:  T M Lowe; S R Eddy
Journal:  Nucleic Acids Res       Date:  1997-03-01       Impact factor: 16.971

7.  The complete mitochondrial genome of the leafminer Liriomyza sativae (Diptera: Agromyzidae): great difference in the A+T-rich region compared to Liriomyza trifolii.

Authors:  Fei Yang; Yu-Zhou Du; Li-Ping Wang; Jing-Man Cao; Wei-Wei Yu
Journal:  Gene       Date:  2011-06-16       Impact factor: 3.688

8.  Complete mitochondrial genome of Chilo suppressalis (Walker) (Lepidoptera: Crambidae).

Authors:  Jiao Yin; Ai-Min Wang; Gui-Yun Hong; Ya-Zhong Cao; Zhao-Jun Wei
Journal:  Mitochondrial DNA       Date:  2011-06

9.  The mitochondrial genome of the ascalaphid owlfly Libelloides macaronius and comparative evolutionary mitochondriomics of neuropterid insects.

Authors:  Enrico Negrisolo; Massimiliano Babbucci; Tomaso Patarnello
Journal:  BMC Genomics       Date:  2011-05-10       Impact factor: 3.969

10.  The complete mitochondrial genome of the damsel bug Alloeorhynchus bakeri (Hemiptera: Nabidae).

Authors:  Hu Li; Haiyu Liu; Liangming Cao; Aimin Shi; Hailin Yang; Wanzhi Cai
Journal:  Int J Biol Sci       Date:  2011-11-24       Impact factor: 6.580

