Li Ma1, Wan-Wei Dong2, Guo-Fang Jiang2, Xing Wang3. 1. Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha, Hunan 410128, China (lily.m1106@gmail.com). 2. Jiangsu Key Laboratory for Bioresource Technology, College of Life Sciences, Nanjing Normal University, Nanjing, Jiangsu, China, and. 3. Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Hunan Agricultural University, Changsha, Hunan 410128, China (lily.m1106@gmail.com), wangxing@hunau.edu.cn.
Abstract
The sweet potato leaf folder, Brachmia macroscopa, is an important pest in China. The complete mitogenome, which consists of 13 protein-coding genes (PCGs), 22 transfer RNA genes, two ribosomal RNA genes, and an A + T-rich region, was sequenced and found to be 15,394 bp in length (GenBank accession no. KT354968). The gene order and orientation of the B. macroscopa mitogenome were similar to those of other sequenced lepidopteran species. All of the PCGs started with the canonical start codon ATN except for cox1, which started with CGA. Most PCGs ended with the stop codon TAA, except for cox2, which ended with TA, and nad4, which ended with a single T. The 13 PCGs of the available species (33 taxa) were used to infer phylogenetic relationships. The ditrysian cluster was strongly supported as a monophyletic clade by both maximum likelihood and Bayesian methods. The apoditrysian group, covering the Gelechioidea, formed a monophyletic clade with a bootstrap value of 88% and a posterior probability of 1.00. The superfamily Gelechioidea was supported as a monophyletic lineage by a posterior probability of 1.00.
As a well-known leaf-eating pest, Brachmia macroscopa Meyrick, belonging to the family Gelechiidae in the superfamily Gelechioidea, occurs in many districts of China, including Shanghai, Zhejiang, Hunan, and Hainan (Wen and Wang 2010), and damages crops by feeding on large amounts of mesophyll (Wang and Tan 2011). Outside China, it is widely distributed in India, the Philippines, Burma, Vietnam, Korea, and Japan (Wang and Tan 2011). This pest has already led to severe crop failures and great economic losses for farmers (Huang and Li 2013). In severe cases, the rate of crop damage has reached 60–85% in some fields (Wen and Wang 2010). However, its genetic characteristics have rarely been reported. To improve the management of B. macroscopa, more information about this pest is needed, including its genetic characteristics and phylogenetic position.
The mitochondrial genome provides information for both phylogenetic analysis and evolutionary biology, and it can help to clarify an insect's phylogenetic position. Compared with individual genes, the complete mitochondrial genome is more informative and also reveals genome-level characteristics, for instance gene arrangement, gene content, base composition, genetic codon variation, and the secondary structures of transfer RNA (tRNA) and ribosomal RNA (rRNA) genes. Lepidopteran mitochondrial DNA (mtDNA) is typically a circular molecule encoding a set of 13 protein-coding genes (PCGs), 22 tRNA genes, and two rRNA genes (Wolstenholme 1992, Boore 1999). A major non-coding element, called the control (A + T-rich) region, is always present (Lewis et al. 1995, Zhang et al. 1995, Inohira et al. 1997, Shao et al. 2001), and this is the site where genome replication and transcription are initiated (Boore 1999, Taanman 1999). The mtDNA datasets have become the most helpful markers in phylogenetics, phylogeography, and population genetic studies (Avise 2000, Cameron et al. 2007).
Gelechioidea, consisting of 1,425 genera and 16,250 described species worldwide, is one of the largest and most important lepidopteran taxa (Hodges 1998). It includes some important insect pests and occupies a very important position in the evolutionary tree of Lepidoptera. The mitogenomes of Gelechioidea have rarely been studied: sequenced mitogenomes were available from only five gelechioid species as of September 2015, including four that are incomplete, Oegoconia novimundi (KJ508036), Perimede sp. (KJ508041), Ethmia eupostica (KJ508047), and Endrosis sarcitrella (KJ508037) (Timmermans et al. 2014), and one recently reported complete mitogenome, that of Promalactis suzukiella (Park et al. 2014). Research on the phylogeny of Gelechioidea has long involved disagreement and unresolved points (Passoa 1995; Kaila 2004, Kaila et al. 2011; Bucheli and Wenzel 2005, Regier et al. 2009, Mutanen et al. 2010).
In this article, the complete mitochondrial genome of B. macroscopa was sequenced for the first time and described in comparison with those of other lepidopteran insects. The phylogenetic position of Gelechioidea was then analyzed based on the mitogenomic data, to assess whether such a complete dataset can resolve deeper phylogeny among the contested lepidopteran lineages, as well as within the Gelechioidea.
Materials and Methods
Sampling and DNA Extraction
Larvae of B. macroscopa were collected from the field (an experimental field belonging to our laboratory and involving no endangered/protected species) around Hunan Agricultural University (Changsha, Hunan, China; 28°11′N, 113°4′E) and were reared in the laboratory on fresh water spinach (Ipomoea aquatica Forsk) leaves. Pupae of the second generation were collected, and total genomic DNA was extracted from them using the Wizard Genomic DNA Purification Kit (Promega, Beijing, China) according to the manufacturer’s instructions.
PCR Amplification and Sequencing
Specific primers were designed with Primer Premier 5 by comparison with known sequences and fragments of lepidopteran insects (Table 1). Universal primers were used to PCR-amplify short fragments of the B. macroscopa mitogenome (Simon et al. 1994, Lee et al. 2006, Simon et al. 2006).
Table 1.
Primers used for amplification of the mitogenome of B. macroscopa
| Fragment | Region | Primer (J/N) | Primer sequence (J/F) 5′→3′ |
| --- | --- | --- | --- |
| F1 | rrnS-trnQ | SR-J-14610 (c) | ATAATAGGGTATCTAATCCTAGT |
| | | Gln-Re (b) | GCACAATAATTTTTGATATTAGATATAGTTTA |
| F2 | nad2 | J-60 (d) | GGTATTTGATCAGGAATAGTAGGAA |
| | | N-735 (d) | CCAATAAATGGGGGTAATCCTCCTA |
| F3 | nad2-cox1 | J-489 (d) | GGGGGATTAAATCAAACCTC |
| | | N-C1Re (b) | GCTGTTACAATAGTATTATAAATTTGATCATC |
| F4 | cox1-cox2 | C1-J-2167 (a) | TTGATTTTTCGGACATCCTGAAGT |
| | | C1-N-3649 (a) | CCGCAAATTTCTGAACATTGACCA |
| F5 | cox2-nad3 | C2-J (d) | CCGCAAATTTCTGAACATTGACCA |
| | | N-5731 (c) | TTTGGATCAAACCCACATTC |
| F6 | cox3-nad5 | C3-J-5470 (c) | GCTGCAGCTTGATATTGACA |
| | | N5-N-7793 (d) | AATCCTAATCCATCTCAACCT |
| F7 | nad5-nad4 | N5-J-7572 (c) | AAAAGGAATTTGAGCTCTTTTAGT |
| | | N4-N-9153 (c) | TGAGGTTATCAACCAGAGCG |
| F8 | nad4-cytb | N4-J-8941 (c) | GAAACTGGGGCTTCAACATGAGC |
| | | N-11328 (d) | GGCAAATAGGAAATATCATTC |
| F9 | cytb-nad1 | CB-J (d) | CATATTCAACCCGAATGATA |
| | | N1-N-12588 (d) | AATCGAACTCCTTTTGATTTTGC |
| F10 | nad1-rrnL | J-11876 (c) | CGAGGTAAAGTACCACGAACTCA |
| | | 16S-N (d) | ATATGTACATATTGCCCGTC |
| F11 | rrnL-rrnS | J-13900 (d) | CTTGTGTATCAGAGTTTATTA |
| | | N-12SR (d) | GTAAAAGTTCAAATAGCAAG |

(a) Primers modified from Simon et al. (1994) prior to this mitogenome.
(b) Primers modified from Lee et al. (2006) prior to this mitogenome.
(c) Primers from Simon et al. (2006) prior to this mitogenome.
(d) Primers newly designed for this genome.
PCR amplification was carried out in 25 μL reactions that contained 0.2 μL rTaq (TaKaRa Co., Dalian, China), 1 μL DNA, 2.5 μL 10× rTaq buffer (Mg2+ free), 2.5 μL 25 mM MgCl2, 2.0 μL dNTPs, and 0.5 μL of each primer. Cycling conditions were 94°C for 5 min; 35 cycles of 94°C for 30 s, 50–59°C for 30 s, and 72°C for 1–2.5 min; and a final extension at 72°C for 10 min. The amplified fragments were sequenced directly with the appropriate primers using a commercial kit.
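The listed reagent volumes do not add up to the full 25 μL, so part of the reaction must be made up with something else. The sketch below adds up the stated volumes and reports the balance; treating that balance as water is an assumption, since the text does not name it, and the dictionary keys are illustrative labels only.

```python
# Volumes (in microlitres) listed in the text for one 25 uL reaction.
REACTION_VOLUME_UL = 25.0
components_ul = {
    "rTaq polymerase": 0.2,
    "template DNA": 1.0,
    "10x rTaq buffer (Mg2+ free)": 2.5,
    "25 mM MgCl2": 2.5,
    "dNTPs": 2.0,
    "J primer": 0.5,
    "N primer": 0.5,
}

def balance_volume(components, total):
    """Return the volume left over once all listed components are added."""
    return round(total - sum(components.values()), 2)

listed_ul = round(sum(components_ul.values()), 2)
# Assumption: the balance is made up with water (not stated in the text).
balance_ul = balance_volume(components_ul, REACTION_VOLUME_UL)
# Final MgCl2 concentration from simple dilution: stock * volume / total.
mgcl2_final_mm = 25.0 * components_ul["25 mM MgCl2"] / REACTION_VOLUME_UL

print(f"listed: {listed_ul} uL, balance: {balance_ul} uL, MgCl2: {mgcl2_final_mm} mM")
```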
Bioinformatic Analysis
Sequences were proofread and assembled using Geneious version 4.8.4 (Drummond et al. 2010). PCG boundaries were identified with the ORF finder (http://www.ncbi.nlm.nih.gov/orf/gorf.html). Once completely sequenced, the mitogenome was annotated both by hand and by automated methods. Manual annotation followed the method presented by Cameron (2014), and automated annotation was carried out with MITOS (Bernt et al. 2013). The tRNA genes were verified using the tRNAscan-SE program (http://lowelab.ucsc.edu/tRNAscan-SE/) (Lowe and Eddy 1997). Unidentified tRNAs were compared with sequences from other species. The two genes encoding the large and small rRNA subunits (rrnL and rrnS) were confirmed by alignment with rRNA sequences of other species published on NCBI, and the secondary structures were determined using the Mfold Web Server (http://mfold.rna.albany.edu/?q=mfold).
The comparative analysis and spread correction were performed with MEGA 6.0 (Tamura et al. 2013) to obtain the complete mitogenome of B. macroscopa. Related lepidopteran species were identified by BLAST searches of the NCBI database. The compositional skews were calculated using the formulas AT skew = [A − T]/[A + T] and GC skew = [G − C]/[G + C] (Junqueira et al. 2004).
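The two skew formulas can be applied directly to base counts. The following is a minimal sketch, not the authors' script; the function name `skews` and the toy sequence are illustrative only.

```python
from collections import Counter

def skews(sequence):
    """Return (AT skew, GC skew) for a DNA sequence.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    A skew is reported as 0.0 when its denominator would be zero.
    """
    counts = Counter(sequence.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_skew = (a - t) / (a + t) if a + t else 0.0
    gc_skew = (g - c) / (g + c) if g + c else 0.0
    return at_skew, gc_skew

# Toy sequence (hypothetical, not from the mitogenome): 4 A, 2 T, 1 G, 3 C.
toy = "AAAATTGCCC"
at_skew, gc_skew = skews(toy)
print(round(at_skew, 3), round(gc_skew, 3))
```

A positive AT skew means more A than T on the strand analyzed, and a negative GC skew means more C than G.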
Phylogenetic Analysis
Along with the B. macroscopa mitochondrial genome, 29 available lepidopteran species (Son and Kim 2011, Gong et al. 2012) and four hepialoid mitogenomes (Napialus hunanensis, Thitarodes pui, Ahamus yunnanensis, and Thitarodes renzhiensis) (Cao et al. 2012, Yi et al. 2016a, 2016b) were used in the phylogenetic analysis (Table 2). The nucleotide and amino acid sequences of all 13 PCGs were aligned using MEGA 6.0 (Tamura et al. 2013). Gblocks 0.91b with default settings was used to select conserved regions of the putative amino acid alignments (Castresana 2000).
Table 2.
Source and information for the phylogenetic analysis
| Superfamily | Family | Species | Accession number | Length (bp) | References |
| --- | --- | --- | --- | --- | --- |
| Urodoidea | Urodidae | Urodus decens | KJ508062 | 15,279 | Timmermans et al. (2014) |
| Yponomeutoidea | Plutellidae | Plutella xylostella | NC_025322 | 16,014 | Dai et al. (2016) |
| | Yponomeutidae | Prays oleae | NC_025948 | 16,499 | van Asch et al. (2016) |
| | Lyonetiidae | L. malifoliella | NC_018547 | 15,646 | Wu et al. (2012) |
| Tineoidea | Tineidae | Tineola bisselliella | KJ508045 | 15,661 | Timmermans et al. (2014) |
| Zygaenoidea | Zygaenidae | Rhodopsona rubiginosa | KM244668 | 15,248 | Tang et al. (2014) |
| Tortricoidea | Tortricidae | Adoxophyes honmai | NC_008141 | 15,680 | Lee et al. (2006) |
| | | Grapholita dimorpha | NC_024582 | 15,813 | Niu et al. (2016) |
| | | Adoxophyes orana | NC_021396 | 15,343 | Wu et al. (2013) |
| | | Cydia pomonella | NC_020003 | 15,253 | Shi et al. (2013) |
| | | Rhyacionia leptotubula | NC_019619 | 15,877 | Zhu et al. (2012) |
| | | Grapholita molesta | NC_014806 | 15,717 | Son and Kim (2011) |
| | | | HQ116416 | 15,776 | Gong et al. (2012) |
| | | Spilonota lechriaspis | NC_014294 | 15,368 | Zhao et al. (2011) |
| | | Choristoneura longicellana | NC_019996 | 15,759 | Unpublished |
| | | Acleris fimbriana | NC_018754 | 15,933 | Zhao et al. (2014) |
| | | Epiphyas postvittana | KJ508051 | 15,451 | Timmermans et al. (2014) |
| | | Retinia pseudotsugaicola | NC_022865 | 15,282 | Unpublished |
| Gelechioidea | Oecophoridae | E. sarcitrella | KJ508037 | 15,317 | Timmermans et al. (2014) |
| | | P. suzukiella | KM875542 | 15,507 | Park et al. (2014) |
| | Elachistidae | E. eupostica | KJ508047 | 15,347 | Timmermans et al. (2014) |
| | Cosmopterigidae | Perimede sp. | KJ508041 | 15,131 | Timmermans et al. (2014) |
| | Autostichidae | O. novimundi | KJ508036 | 15,408 | Timmermans et al. (2014) |
| | Gelechiidae | B. macroscopa | KT354968 | 15,400 | Present study |
| Cossoidea | Cossidae | Eogystia hippophaecolus | NC_023936 | 15,431 | Gong et al. (2013) |
| Hepialoidea | Hepialidae | N. hunanensis | NC_024424 | 15,301 | Yi et al. (2016a) |
| | | T. pui | NC_023530 | 15,064 | Yi et al. (2016b) |
| | | A. yunnanensis | NC_018095 | 15,816 | Cao et al. (2012) |
| | | T. renzhiensis | NC_018094 | 16,173 | Cao et al. (2012) |
| Gracillarioidea | Gracillariidae | Phyllonorycter platani | KJ508044 | 15,791 | Timmermans et al. (2014) |
| | | Phyllonorycter froelichiella | KJ508048 | 15,538 | Timmermans et al. (2014) |
| | | Cameraria ohridella | KJ508042 | 15,513 | Timmermans et al. (2014) |
| Copromorphoidea | Carposinidae | Carposina sasakii | NC_023212 | 15,611 | Wu et al. (2016) |
The best-fitting model for each of the 13 PCGs in the nucleic acid dataset was selected under the Akaike Information Criterion (AIC) with jModelTest 2.1.5 (Darriba et al. 2012). Bayesian inference (BI) analysis was then performed with MrBayes v3.2.3 (Ronquist and Huelsenbeck 2003), with the MCMC analysis run for 1,000,000 generations and a burn-in of 1,000. In addition, the maximum likelihood (ML) analysis was conducted with RAxML v8.0.2 (Stamatakis 2014), in which the best-fit substitution models for the 13 separate PCGs (amino acid sequences) were determined on the ProtTest web server according to the AIC (Abascal et al. 2005).
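Model selection under the AIC amounts to computing AIC = 2k − 2 ln L for each candidate model and keeping the one with the lowest value. The sketch below illustrates that rule only; it does not reproduce jModelTest or ProtTest, and the model names, log-likelihoods, and parameter counts are invented for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def best_model(fits):
    """Return the name of the model with the lowest AIC.

    `fits` maps a model name to (log-likelihood, number of free parameters).
    """
    return min(fits, key=lambda name: aic(*fits[name]))

# Hypothetical fits for a single gene partition (values invented).
example_fits = {
    "JC": (-5230.4, 1),
    "HKY+G": (-5105.9, 6),
    "GTR+I+G": (-5098.2, 11),
}

for name, (lnl, k) in example_fits.items():
    print(f"{name}: AIC = {aic(lnl, k):.1f}")
print("selected:", best_model(example_fits))
```

In a partitioned analysis this selection is repeated once per gene, giving 13 independently chosen models.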
Results
Genome Organization and Base Composition
The complete mtDNA of B. macroscopa is a 15,394 bp closed circle (Fig. 1) that encodes 37 genes and contains a putative A + T-rich region. The 37 genes comprise 13 PCGs (cox1–3, nad1–6, nad4L, atp6, atp8, and cytb), 22 tRNA genes, and 2 rRNA genes (Table 3). According to Table 3, 23 of the genes (9 PCGs and 14 tRNA genes) are transcribed on the major strand (J-strand), and the other 14 (4 PCGs, 8 tRNA genes, and both rRNA genes) are transcribed on the minor strand (N-strand).
Fig. 1.
Circular map of the B. macroscopa mitogenome. Gene names follow the standard abbreviations adopted in this article; tRNA genes are labeled with the IUPAC-IUB single-letter abbreviation of their matching amino acid.
Table 3.
Summary of the B. macroscopa mitogenome
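| Gene | Strand | Nucleotide no. | Size (bp) | Intergenic nucleotides (IN) | Anticodon | Start codon | Stop codon |
| --- | --- | --- | --- | --- | --- | --- | --- |
| trnM | J | 1–67 | 67 | 8 | CAT | | |
| trnI | J | 76–142 | 67 | 4 | GAT | | |
| trnQ | N | 147–214 | 70 | 83 | TTG | | |
| nad2 | J | 298–1,284 | 987 | −2 | | ATA | TAA |
| trnW | J | 1,283–1,350 | 68 | −8 | TCA | | |
| trnC | N | 1,343–1,409 | 67 | 11 | GCA | | |
| trnY | N | 1,421–1,489 | 69 | 3 | GTA | | |
| cox1 | J | 1,493–3,028 | 1536 | −5 | | CGA | TAA |
| trnL(UUR) | J | 3,024–3,091 | 68 | 0 | TAA | | |
| cox2 | J | 3,092–3,778 | 687 | −5 | | ATG | TA- |
| trnK | J | 3,774–3,844 | 71 | 0 | CTT | | |
| trnD | J | 3,845–3,913 | 69 | 9 | GTC | | |
| atp8 | J | 3,923–4,078 | 156 | −7 | | ATA | TAA |
| atp6 | J | 4,072–4,749 | 678 | −1 | | ATG | TAA |
| cox3 | J | 4,749–5,537 | 789 | 38 | | ATG | TAA |
| trnG | J | 5,576–5,643 | 68 | −3 | TCC | | |
| nad3 | J | 5,641–5,997 | 357 | 2 | | ATA | TAA |
| trnA | J | 6,000–6,067 | 67 | −1 | TGC | | |
| trnR | J | 6,067–6,130 | 64 | 0 | TCG | | |
| trnN | J | 6,131–6,196 | 66 | 3 | GTT | | |
| trnS(AGN) | J | 6,200–6,265 | 66 | 7 | GCT | | |
| trnE | J | 6,273–6,335 | 63 | 1 | TTC | | |
| trnF | N | 6,337–6,402 | 66 | 1 | GAA | | |
| nad5 | N | 6,404–8,146 | 1743 | −10 | | ATT | TAA |
| trnH | N | 8,137–8,201 | 65 | 0 | GTG | | |
| nad4 | N | 8,202–9,540 | 1339 | 0 | | ATG | T- |
| nad4L | N | 9,541–9,834 | 294 | 2 | | ATG | TAA |
| trnT | J | 9,837–9,902 | 66 | 0 | TGT | | |
| trnP | N | 9,903–9,972 | 70 | 2 | TGG | | |
| nad6 | J | 9,975–10,502 | 528 | 14 | | ATA | TAA |
| cytb | J | 10,517–11,671 | 1155 | 2 | | ATA | TAA |
| trnS(UCN) | J | 11,674–11,743 | 70 | 3 | TGA | | |
| nad1 | N | 11,747–12,703 | 957 | −6 | | ATA | TAG |
| trnL(CUN) | N | 12,698–12,768 | 71 | 16 | TAG | | |
| rrnL | N | 12,769–14,225 | 1457 | 0 | | | |
| trnV | N | 14,226–14,291 | 66 | 0 | TAC | | |
| rrnS | N | 14,292–15,069 | 778 | 0 | | | |
| A + T-rich | – | 15,070–15,394 | 325 | 0 | | | |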
The intergenic spacer sequences totaled 184 bp across 16 regions, which varied from 1 to 83 bp in length. The two largest spacers, located between trnQ and nad2 (83 bp) and between cox3 and trnG (38 bp), accounted for 65.8% of all spacer nucleotides. In addition, 10 overlaps totaling 48 bp were present in the B. macroscopa mitogenome; the smallest was 1 bp and the largest, located between nad5 and trnH, was 10 bp.
The overall nucleotide composition of B. macroscopa mtDNA was 41.0% A, 39.9% T, 11.3% C, and 7.7% G, giving a strongly biased A + T content (80.9%). The AT skew was slightly positive (0.014), whereas the GC skew was negative (−0.188).
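The overlap figures can be checked against the intergenic-nucleotide column of Table 3, where a negative value marks an overlap with the following gene. The sketch below is only an illustrative check: the values are transcribed from Table 3 in gene order, and the variable names are hypothetical.

```python
# (gene, intergenic nucleotides) in gene order, transcribed from Table 3.
# A negative value means the gene overlaps the gene that follows it.
intergenic = [
    ("trnM", 8), ("trnI", 4), ("trnQ", 83), ("nad2", -2), ("trnW", -8),
    ("trnC", 11), ("trnY", 3), ("cox1", -5), ("trnL(UUR)", 0), ("cox2", -5),
    ("trnK", 0), ("trnD", 9), ("atp8", -7), ("atp6", -1), ("cox3", 38),
    ("trnG", -3), ("nad3", 2), ("trnA", -1), ("trnR", 0), ("trnN", 3),
    ("trnS(AGN)", 7), ("trnE", 1), ("trnF", 1), ("nad5", -10), ("trnH", 0),
    ("nad4", 0), ("nad4L", 2), ("trnT", 0), ("trnP", 2), ("nad6", 14),
    ("cytb", 2), ("trnS(UCN)", 3), ("nad1", -6), ("trnL(CUN)", 16),
    ("rrnL", 0), ("trnV", 0), ("rrnS", 0), ("A + T-rich", 0),
]

overlaps = [(gene, -value) for gene, value in intergenic if value < 0]
overlap_count = len(overlaps)
overlap_total = sum(size for _, size in overlaps)
largest_overlap = max(overlaps, key=lambda item: item[1])
smallest_overlap = min(size for _, size in overlaps)

# Share of the two largest spacers in the 184 bp spacer total given in the text.
major_spacer_share = round(100 * (83 + 38) / 184, 1)

print(overlap_count, overlap_total, largest_overlap, smallest_overlap, major_spacer_share)
```

The largest overlap is recorded on the nad5 row, which corresponds to the nad5/trnH junction named in the text.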
Codon Usage in PCGs
Regarding the PCGs, the major strand included nad2, cox1, cox2, atp8, atp6, cox3, nad3, nad6, and cytb, while the minor strand harbored nad5, nad4, nad4L, and nad1. All of the PCGs started with ATN except for cox1, which started with CGA. Eleven of the 13 PCGs ended with TAA; the exceptions were cox2, which ended with TA, and nad4, which ended with a single T.
The PCGs contained a total of 3,593 codons, excluding the start and termination codons. AUU (Ile), AAU (Asn), UUU (Phe), UUA (Leu), and AUA (Met) were the most abundant codons and made up 47.1% of the total. According to the summarized codon usage, A + T-rich codons were generally used more often than G + C-rich codons (Fig. 2A). The relative synonymous codon usage (RSCU) showed that, in both twofold and fourfold degenerate codon families, codons ending in A or U were more frequent than those ending in G or C (Fig. 2B).
Fig. 2.
Codon usage in the B. macroscopa mitogenome. (A) CDspT (codons per thousand) indicate the codons used in coding amino acids per thousand codons. Codon families are given on the x-axis. (B) RSCU.
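RSCU compares how often a codon is used with how often it would be used if all synonymous codons for the same amino acid were used equally, so values above 1 indicate a preferred codon. The following is a minimal sketch of that calculation, not the tool used in the study; it assumes the invertebrate mitochondrial genetic code (NCBI translation table 5) and works on DNA codons (T in place of U). The function name `rscu` is illustrative.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# Invertebrate mitochondrial code (NCBI translation table 5), codons in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def rscu(cds):
    """Return {codon: RSCU} for every sense codon of each amino acid seen in `cds`.

    RSCU = observed count * (number of synonymous codons) / (total count
    for that amino acid). Stop codons and incomplete codons are ignored.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total:
            for c in family:
                values[c] = counts[c] * len(family) / total
    return values

# Toy coding sequence (hypothetical): Phe encoded twice by TTT and once by TTC.
toy_rscu = rscu("TTTTTTTTC")
print({codon: round(value, 3) for codon, value in toy_rscu.items()})
```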
The Secondary Structure for RNA Genes
The two rRNA genes, 16S (rrnL) and 12S (rrnS), were 1,457 and 778 bp in length, respectively. The rrnL gene was located between trnL(CUN) and trnV, and the rrnS gene was located between trnV and the A + T-rich region. Both genes were located on the N-strand. The complete secondary structures of the rrnL and rrnS molecules were determined and are shown in Figs. 3 and 4, respectively. The rrnL secondary structure contained six domains, made up of two conserved and four variable regions, and the rrnS secondary structure contained three domains. The B. macroscopa mitogenome contained 22 tRNA genes, ranging from 63 to 71 bp in length. Fourteen of these genes were mapped on the J-strand and eight on the N-strand. All of the tRNAs showed the typical cloverleaf secondary structure except for trnS, which lacked a dihydrouridine arm (Fig. 5).
Fig. 3.
Predicted secondary structure of the rrnL gene in the B. macroscopa mitogenome. Red-colored letters stand for variable positions and blue for conserved positions. Tertiary structures are noted by continuous lines. Base-pairing is delineated as follows: Watson-Crick pairs are joined by lines, GU pairs by dots, and other non-canonical base pairs by asterisks. Each domain is indicated with Roman numerals.
Fig. 4.
Predicted secondary structure of the rrnS gene in the B. macroscopa mitogenome. Red-colored letters stand for variable positions and blue for conserved positions. Tertiary structures are noted by continuous lines. Base-pairing is delineated as follows: Watson-Crick pairs are joined by lines, GU pairs by dots, and other non-canonical base pairs by asterisks. Each domain is indicated with Roman numerals.
Fig. 5.
Predicted secondary structures of the tRNA genes in the B. macroscopa mitogenome. Dashes (–) indicate Watson-Crick base-pairing, centered solid pentagrams indicate G-U base-pairing, and solid triangles (▴) indicate mismatches.
Features in the A + T-Rich Region
The A + T-rich region, located between the rrnS and trnM genes, was 325 bp long, had a high A + T content (96.7%), and lacked large repeated fragments. Its conserved structures included an ‘ATAGT’-like motif and a poly-T stretch; the poly-T stretch is considered to be the origin of minor-strand replication (Fig. 6A). Three microsatellites, ‘(TA)8’, ‘(TA)9’, and ‘(TA)7’, were observed in this region, located 205, 241, and 261 bp upstream of rrnS, respectively. The conserved ‘ATTTA’ sequence was located between the microsatellites (TA)8 and (TA)9. A potential stem-loop structure was present in the A + T-rich region, but without the ‘TATA’ sequence at the 5′ end and without the ‘G(A)nT’ motif at the 3′ end (Fig. 6B).
Fig. 6.
The A + T-rich region of B. macroscopa mitogenome. (A) Structure of the A + T-rich region. (B) A potential stem-loop structure found in the A + T-rich region.
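Dinucleotide microsatellites such as (TA)n can be located with a simple pattern search. The sketch below is one possible way to do this, not the procedure used in the study; the function name, the threshold of seven repeat units, and the toy sequence are all assumptions chosen for illustration.

```python
import re

def find_ta_repeats(sequence, min_units=7):
    """Return (start, units) for each run of at least `min_units` TA repeats.

    `start` is the 0-based position of the run in `sequence`.
    """
    pattern = re.compile(r"(?:TA){%d,}" % min_units)
    return [
        (match.start(), len(match.group()) // 2)
        for match in pattern.finditer(sequence.upper())
    ]

# Toy sequence (hypothetical): (TA)8, an ATTTA-containing spacer, then (TA)9.
toy_region = "GGG" + "TA" * 8 + "CATTTAC" + "TA" * 9 + "GG"
repeats = find_ta_repeats(toy_region)
print(repeats)
```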
Phylogenetic Relationships
The phylogenetic relationships among the superfamilies within the Ditrysia were reconstructed and are shown in Figs. 7 and 8. The topologies of the two trees were almost identical. The ditrysian group was supported as monophyletic by a bootstrap value of 100% and a posterior probability of 1.00. Both the ML and BI analyses show that the Tortricoidea is a robust monophyletic group, supported by a bootstrap value of 100%, and is sister to the clade ((Urodoidea + Copromorphoidea) + (Gelechioidea + (Cossoidea + Zygaenidae))), with a bootstrap value of 100% and a posterior probability of 1.00. However, the bootstrap values of all smaller clades within the latter clade are lower. These six superfamilies together are sister to the clade (Yponomeutoidea + Gracillarioidea), in which Gracillarioidea is a robust monophyletic group, whereas Yponomeutoidea is polyphyletic. Tortricoidea was recovered as sister to the rest of Apoditrysia in the analyses, and this was supported by a bootstrap value of 100% and a posterior probability of 1.00. The Gelechioidea was assumed to be a sister group to the Apoditrysia (Cho et al. 2011), but this was not supported by our results because the Gelechioidea was nested within the clade Apoditrysia in both trees. The Apoditrysia group (Gelechioidea, Zygaenoidea, Cossoidea, Copromorphoidea, Urodoidea, and Tortricoidea) was a monophyletic clade with a bootstrap value of 100% and a posterior probability of 1.00. A sister relationship between Yponomeutoidea + Gracillarioidea and the Apoditrysia was supported by a bootstrap value of 88% and a posterior probability of 1.00. The superfamily Tineoidea was a monophyletic clade with a bootstrap value of 100% and a posterior probability of 1.00. It is the earliest-diverging clade within Ditrysia, supporting the general agreement that tineoids are the oldest ditrysian superfamily (Minet 1991, Kristensen and Skalski 1998).
The superfamily Gelechioidea, which includes O. novimundi, E. sarcitrella, E. eupostica, B. macroscopa, and Perimede sp. in this study, formed a monophyletic group supported by a posterior probability of 1.00 (Fig. 8).
Fig. 7.
Phylogeny of lepidopteran superfamilies including B. macroscopa. ML phylogram based on the concatenated 13 PCGs of the mitogenomes of these species, analyzed with partitioned models. The scale bar indicates the number of substitutions per site, and the value at each node is the bootstrap percentage from 1,000 replicates.
Fig. 8.
BI phylogram based on the concatenated 13 PCGs of the mitogenomes of these species, analyzed with partitioned models. The scale bar indicates the number of substitutions per site, and values of each node specify bootstrap percentages of 1,000 replicates.
Discussion
Genome Organization and Base Characteristics
The gene order and orientation of the B. macroscopa mitogenome were identical to those of the fully sequenced ditrysian species, with the gene order trnM-trnI-trnQ. This differs from the gene order trnI-trnQ-trnM found in the non-ditrysian lineages of Lepidoptera (Flook et al. 1995, Yi et al. 2014).
The nucleotide composition of the B. macroscopa mitogenome was slightly skewed, with an A + T content of 80.9%, similar to that of other lepidopteran insects such as Lobocla bifasciatus, Argynnis nerippe (Kim et al. 2011, 2014), and Argynnis hyperbius (Wang et al. 2012). The AT skew was 0.014, which indicates that A occurs more often than T. The AT content of the PCGs was 79.2%, similar to that of Potanthus flavus (Kim et al. 2014) and Kallima inachus (Qin et al. 2012). Among the PCGs, the AT content of cox1 was the lowest, at 72.1%, whereas that of atp8 was considerably higher, at 95.5%. Numerous studies have concluded that there is usually more A than T and more C than G on the J-strand, although the situation is occasionally reversed in some species (Wei et al. 2010). We, however, observed that the PCGs on the J-strand had both a positive AT skew and a positive GC skew, meaning that this species has more A than T and fewer C than G on the major strand. To our knowledge, this pattern has not previously been reported in the mitogenome of any insect. We have not yet elucidated the mechanism behind this phenomenon. Nevertheless, earlier studies found that the value of the GC skew was associated not with gene direction but with replication orientation, while the value of the AT skew could vary with gene direction, replication, and codon position (Wei et al. 2010).
Intergenic Spacer Regions and Overlapping
A spacer is commonly located between trnQ and nad2 in lepidopteran species, but it is not conserved, as its length varies among species. It was the longest of all spacers in B. macroscopa. This result might support the conclusion that the spacer between trnQ and nad2 has no functional significance or that it acts as another origin of replication (Cameron and Whiting 2008). In addition, this spacer, which has only been found in lepidopteran insects, shows high homology with the adjacent nad2 gene (Xia et al. 2011).
The common 7 bp overlap between atp8 and atp6, which is widely distributed in other lepidopteran mitogenomes (Jiang et al. 2009, Zhu et al. 2013, Chen et al. 2014), was also present in the B. macroscopa mitogenome. Meanwhile, the position of the maximum overlap, between nad5 and trnH, differed from that in other lepidopteran mitogenomes, such as Diaphania pyloalis, which has its maximum overlap between trnF and nad5 (Zhu et al. 2013), Chilo suppressalis, which also has its maximum overlap between trnF and nad5 (Chai et al. 2012), and Attacus atlas, which has its maximum overlap between trnW and trnC (Chen et al. 2014).
Protein-Coding Genes
‘ATA’ and ‘ATG’ were present as start codons with the same frequency among the 12 PCGs, while ‘CGA’ was the start codon of cox1. Sequencing of the 5′ region of cox1 genes from 39 lepidopteran species indicated that ‘CGA’ is conserved in lepidopteran mitogenomes and implied that ‘CGA’ may be a synapomorphic feature of Lepidoptera (Kim et al.).
Eleven of the 13 PCGs ended with ‘TAA’ codons, while the remaining two ended with ‘TA’ or a single ‘T’. These incomplete termination codons can be completed to a full TAA by polyadenylation during mRNA processing (Anderson et al. 1981, Boore 1999, Gong et al. 2012). It was concluded that A and U were more frequently used in PCGs because the value for the RSCU of NNU and NNA codons was always >1.
rRNA and tRNA Genes
The rRNA genes of insects are usually conserved in mitogenomes. Accordingly, the secondary structures of rrnL and rrnS in B. macroscopa are similar to those of other Lepidoptera (Gillespie et al. 2006, Niehuis et al. 2006a, b; Chai et al. 2012). The base pairs in the rrnL gene did not fully comply with Watson-Crick base-pairing. The mispairing of H991 in the rrnL secondary structure was not observed in Leucoptera malifoliella (Lepidoptera: Lyonetiidae) (Fig. 4), but it is very common in lepidopteran insects, such as Apocheima cinerarius (Lepidoptera: Geometridae), C. suppressalis (Lepidoptera: Pyralidae), Manduca sexta (Lepidoptera: Sphingidae), and Zygaena sarpedon lusitanica (Lepidoptera: Zygaenidae). H47, H673, H1047, and H1241 in the secondary structure of the rrnS gene differed from those of M. sexta in structure and length. The H47 portion was one of the variable sites among the species (Gillespie et al. 2006, Gong et al. 2012); therefore, it could provide valuable information on the phylogenetic relationship of H39 and H367 (Wei et al. 2010). Most of the tRNAs are likely to fold into the typical cloverleaf secondary structure. The only exception was trnS, which could not form a complete cloverleaf structure; this incomplete structure evolved early in the metazoans (Garey and Wolstenholme 1989, Wolstenholme 1992). However, previous research has shown that trnS is highly conserved in nearly all families, as well as in B. macroscopa.
Knowledge of insects involves insect ecology, behavior, systematics, host plant choices, and so on, but information about the worldwide superfamily Gelechioidea has been very limited (Hodges 1998). The sister group of Gelechioidea is still unknown (Mutanen et al. 2010). Mitogenomic phylogeny research was carried out by Timmermans et al. (2014), and it demonstrated that the Gelechioidea was not grouped with other lower Apoditrysia.
The phylogeny of 10 superfamilies in Lepidoptera was reconstructed from mitochondrial data and showed a sister relationship between Gelechioidea and Zygaenidae + Cossoidea, which cluster at the same node with a Bayesian posterior probability of 1.00. Notably, our study supports the placement of the Gelechioidea within the Apoditrysia, which differs from the findings of Timmermans et al. (2014). Both analyses were based on mitogenome data, and whether these differences are due to the introduction of the complete mitogenome in this study remains to be investigated. The phylogenetic relationships among the Yponomeutoidea, Gracillarioidea, Gelechioidea, and Apoditrysia were not well resolved by Kristensen et al. (2007). In our results, Yponomeutoidea + Gracillarioidea and Apoditrysia were sister groups and clustered on the same branch, with a bootstrap value of 88% and a high posterior probability of 1.00. Although the ML support was lower, it still provides some information on these relationships. This structure was supported by Regier et al. (2013), and it differed in many respects from the nested Yponomeutoidea + Gracillarioidea and Gelechioidea (Kristensen and Skalski 1998).
The ML and BI analyses have different advantages in reconstructing phylogenetic trees, and in some instances the two methods may give different results. The monophyly of Gelechioidea revealed in this study is largely supported by previous analyses based on morphology, ecology, and mitogenomes (Passoa 1995, Kaila 2004, Bucheli and Wenzel 2005, Regier et al. 2009, Mutanen et al. 2010, Kaila et al. 2011). Although the ML support was lower, both analyses recovered the monophyly of Gelechioidea. Six species were chosen, representing five subfamilies (Symmocinae, Oecophorinae, Ethmiinae, Dichomeridinae, and Chrysopeleiinae).
According to the analysis of Mutanen et al. (2010), Autostichidae was not monophyletic unless Glyphydoceridae and Deoclonidae were included. The trees in Figs. 7 and 8 showed that Autostichidae and Oecophoridae clustered at the same node, and therefore the monophyly of Autostichidae was not confirmed. On closer inspection, the Ethmiinae and the other two subfamilies had higher bootstrap values, and their relationships strongly resembled the earlier phylogenetic results of Passoa (1995) and Bucheli and Wenzel (2005). The familial relationships recovered here, ((Autostichidae + Oecophoridae) + Elachistidae) + (Gelechiidae + Cosmopterigidae) by ML and ((Elachistidae + Gelechiidae) + Oecophoridae) + (Autostichidae + Cosmopterigidae) by BI, clearly conflicted with the latest molecular analysis, which revealed different affinities (Elachistidae + (Cosmopterigidae + Gelechiidae) + (Autostichidae + Oecophoridae)) (Heikkilä et al. 2014). Obtaining better mitogenome-based phylogenies will probably require that more Gelechioidea mitogenomes are sequenced.