Literature DB >> 25860387

Morphological characters are compatible with mitogenomic data in resolving the phylogeny of nymphalid butterflies (lepidoptera: papilionoidea: nymphalidae).

Qing-Hui Shi1, Xiao-Yan Sun2, Yun-Liang Wang3, Jia-Sheng Hao4, Qun Yang2.   

Abstract

Nymphalidae is the largest family of butterflies with their phylogenetic relationships not adequately approached to date. The mitochondrial genomes (mitogenomes) of 11 new nymphalid species were reported and a comparative mitogenomic analysis was conducted together with other 22 available nymphalid mitogenomes. A phylogenetic analysis of the 33 species from all 13 currently recognized nymphalid subfamilies was done based on the mitogenomic data set with three Lycaenidae species as the outgroups. The mitogenome comparison showed that the eleven new mitogenomes were similar with those of other butterflies in gene content and order. The reconstructed phylogenetic trees reveal that the nymphalids are made up of five major clades (the nymphaline, heliconiine, satyrine, danaine and libytheine clades), with sister relationship between subfamilies Cyrestinae and Biblidinae, and most likely between subfamilies Morphinae and Satyrinae. This whole mitogenome-based phylogeny is generally congruent with those of former studies based on nuclear-gene and mitogenomic analyses, but differs considerably from the result of morphological cladistic analysis, such as the basal position of Libytheinae in morpho-phylogeny is not confirmed in molecular studies. However, we found that the mitogenomic phylogeny established herein is compatible with selected morphological characters (including developmental and adult morpho-characters).

Entities:  

Mesh:

Substances:

Year:  2015        PMID: 25860387      PMCID: PMC4393276          DOI: 10.1371/journal.pone.0124349

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The family Nymphalidae (Lepidoptera: Papilionoidea) is the largest group of butterflies with about 7,200 species distributed on all continents except Antarctica [1-3]. They are usually medium to large sized butterflies with striking colors, specialized insect-plant interaction and a deep rooted evolutionary history probably 90 million years ago [4, 5]. Owing to their species richness and ecological diversification, nymphalids have been used as model taxa for a wide range of evolutionary and ecological studies [6]. However, the phylogeny of Nymphalidae is one of the most controversial issues in insect studies [7-10]. For instance, in Chou’s taxonomic system, widely adopted in China, the nymphalid butterflies are split into 10 morphological subfamilies [4], while the most commonly adopted classification of the family, proposed by Ackery et al. [11], Harvey [12] and Müller [13], consists of 13 subfamilies. Recently, although efforts have been focused on the phylogeny of Nymphalidae by using morphological, molecular (mitochondrial and nuclear genes), or combined data, problems still exist, particularly about the phylogenetic status of Biblidinae, Pseudergolinae, Cyrestinae, Morphinae, Libytheinae and Charaxinae [14-21]. Common to previous molecular phylogenetic studies are small number of molecular markers used with conflicting results and often weak supports about their phylogenetic relationships. The insect mitochondrial genomes (mitogenomes) provide abundant molecular markers for phylogenetic and population genetic studies at various hierarchical levels, due to their unique features, such as maternal inheritance, lack of extensive recombination and moderate nucleotide substitution rates [22, 23]. Up to now, existing information about butterfly mtDNAs, especially the complete genomes of mtDNA are still limited. Only 22 nymphalid mitogenomes so far deposited in GenBank (up to Jun 15, 2014) represent eight of the thirteen nymphalid subfamilies. However, the corresponding sequences of the other major nymphalid groups (e.g., Charaxinae, Pseudergolinae, Cyrestinae, Biblidinae and Morphinae) are still lacking. In this study, we sequenced nine complete and two nearly complete mitogenomes of nymphalids, representing nine subfamilies (Hypolimnas bolina for Nymphalinae, Cethosia bibles for Heliconiinae, Polyura nepenthes for Charaxinae, Ariadne ariadne for Biblidinae, Elymnias hypermnestra, Lethe dura and Callerebia suroia for Satyrinae, Parantica aglea for Danainae, Cyrestis thyodamas for Cyrestinae, Dichorragia nesimachus for Pseudergolinae and Stichophthalma howqua for Morphinae) and compared them with other available nymphalid mitogenomes. A phylogenetic analysis based on mitogenomic data set, assisted with morphological comparison, is attempted to further resolve the nymphalid phylogeny.

Materials and Methods

Ethics Statement

All samples of butterfly species used in this study were collected from the mountains or peri-urban areas of China, where no specific permissions were required for these locations or activities. There are no endangered or protected species and all the all samples are common nymphalid butterfly species which are not included in the “List of Protected Animals in China” and other related lists of animal species protection in the world. Thus the sampling in this study did not violate any law, rule or regulation in China and all around the world, requiring no ethical or institutional approval.

Samples and DNA extraction

The samples of the eleven newly sequenced nymphalid species were collected in China from 2006 to 2012 (S1 Table). After sample collection, fresh individuals were preserved in 100% ethanol and stored at -20°C until used for genomic DNA extraction. Total genomic DNA was extracted from the thorax muscle of a single individual using the Sangon Animal genome DNA Extraction Kit in accordance with the manufacturer instructions (Shanghai, China).

Primer design, PCR amplification, and sequencing

Insect universal primers were initially used for the amplification of partial fragments of four genes (cox1, cox3, cob and rrnS gene) [24-26], approximately 500~700 bp in length. Primers for amplification of the nad2, nad5 and overlapping long fragments were designed via the sequence alignment of all the other sequenced butterfly mitogenomes available, using ClustalX 2.1 and Primer Premier 5.0 software [27, 28]. All primers were synthesized by the Sangon Biotechnology Co. Ltd., Shanghai, China. Six long fragments (cox1cox3, cox3nad5, nad5cob, cob—rrnS, rrnS—nad2, nad2cox1) were amplified via long PCR technique with the cycling parameters: 30 cycles of denaturation at 95°C for 50 seconds, annealing at 49–55°C (depending on primer pairs) for 50 seconds, and elongation at 68°C for 150 seconds, and a final extension at 68°C for 10 minutes. The PCR products were visualized by electrophoresis on 1.2% agarose gel, then purified using a 3S Spin PCR Product Purification Kit and sequenced directly for the majority of fragments or by cloning the rrnS—nad2 for some species, with an ABI-377 automatic DNA sequencer. All of the long PCR fragments were sequenced using the primer walking strategy.

Sequence assembly, gene annotation, and analysis

Sequences from overlapping fragments were initially assembled via the alignment of neighboring fragments using the BioEdit 7.0 [29] and ClustalX 2.1 softwares [27]. Protein-coding genes (PCGs) and ribosomal RNA genes (rRNAs) were identified using ClustalX 2.1 software [27] and the NCBI Internet BLAST search function. Identification of transfer RNA genes (tRNAs) was conducted using software tRNAscan-SE 1.21 (http://lowelab.ucsc.edu/tRNAscan-SE/) [30]. Putative tRNAs that could not be found by tRNAscan-SE were identified by alignment with those of the other butterfly species available. The tandem repeats in the A+T-rich region were predicted using the Tandem Repeats Finder online (http://tandem.bu.edu/trf/trf.html) [31]. Nucleotide composition and codon usage were calculated by MEGA 5.0 software [32]. All three codon positions of the 13 PCGs concatenated nucleotide dataset were tested independently for substitution saturation, by plotting the number of observed substitutions against the genetic distance estimates using MEGA 5.0 software [32]. The rRNA secondary structures were predicted after the models for Drosophila [33], Apis mellifera [34] and Manduca sexta [35]. Stem-loops were named using both the conventions of Gillespie et al. [34] and Niehuis et al. [36, 37] with the former notation first for each time they are mentioned.

Phylogenetic analysis

Thirty three nymphalid species with complete or nearly complete mitogenome sequence data, including eleven newly determined in this study, were used in the phylogenetic analyses, representing all currently recognized subfamilies of Nymphalidae (Table 1). Three species, Coreana raphaelis, Spindasis takanonis and Protantigius superans from Lycaenidae, being sister group of Nymphalidae, were selected for outgroup comparisons.
Table 1

List of the nymphalid species analyzed in this study.

SubfamilyTribeSpeciesSize (bp)Genbank accession no.Resources
CharaxinaeCharaxini Polyura nepenthes 15,333KF990128This study
Calinaginae - Calinaga davidis 15,267HQ658143Xia et al., 2011
Pseudergolinae - Dichorragia nesimachus 14,367KF990126This study
SatyrinaeMelanitini Melanitis Leda 15,122JF905446Shi et al., 2013a
Satyrini Eumenis autonoe 15,489GQ868707Kim et al., 2010
Satyrini Callerebia suroia 15,208KF906483Shi et al., 2014
Elymniini Elymnias hypermnestra 15,167KF906484This study
Elymniini Lethe dura 15,259KF906485This study
Morphinae - Stichophthalma howqua 14,020KF990129This study
HeliconiinaeArgynnini Argyreus hyperbius 15,156JF439070Wang et al., 2011
Argynnini Fabriciana nerippe 15,140JF504707Kim et al., 2011a
Argynnini Issoria lathonia 15,172NC_018030Unpublished
Heliconiini Heliconius melpomene 15,325HE579083Dasmahapatra et al., 2012
Acraeini Acraea issoria 15,245GQ376195Hu et al., 2010
Acraeini Cethosia biblis 15,211KF990124This study
LimenitidinaeLimenitidini Parathyma sulpitia 15,268JQ347260Tian et al., 2012
Apaturinae - Apatura ilia 15,242JF437925Chen et al., 2012
- Apatura metis 15,236NC_015537Zhang et al., 2012
- Sasakia charonda 15,244NC_014224Wang et al., 2012
- Sasakia funebris 15,233JX131328Wang et al., 2013
- Timelaea maculata 15,178KC572131Cao et al., 2013
Biblidinae - Ariadne ariadne 15,179KF990123This study
Cyrestinae - Cyrestis thyodamas 15,254KF990125This study
NymphalinaeKallimini Kallima inachus 15183JN857943Qin et al., 2012
Melitaeini Melitaea cinxia 15,170NC_018029Unpublished
Junoniini Junonia orithya 15,214KF199862Shi et al., 2013b
Junoniini Hypolimnas bolina 15,260KF990127This study
Libytheinae - Libythea celtis 15,164NC_016724Hao et al., 2013
DanainaeDanaini Euploea mulciber 15,166NC_016720Hao et al., 2013
Danaini Danaus plexippus 15,314NC_021452Servín and Martínez, 2013
Danaini Danaus chrysippus 15,236KF690637Gan et al., 2014a
Danaini Ideopsis similes 15,200KJ476729Gan et al., 2014b
Danaini Parantica aglea 15,210KM018020This study
outgroupTheclini Coreana raphaelis 15,314NC_007976Kim et al., 2006
Theclini Protantigius superans 15,248NC_016016Kim et al., 2011b
- Spindasis takanonis 15,349NC_016018Kim et al., 2011b
The phylogenetic trees were reconstructed with the neighbor joining (NJ), maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) methods based on two different nucleotide datasets as follows: D1 (13 PCGs only) and D2 (13 PCGs plus 2 rRNAs). The mitochondrial genes were separately aligned by MUSCLE in MEGA 5.0 software [32]. The PCGs were aligned according to their amino sequence similarity, whereas rRNAs were directly aligned according to sequence similarity using default settings. Then each of individual alignments was concatenated as a combined matrix. However, the scattergrams analysis showed that the substitution for third codon position of 13 PCGs trends toward saturation, potentially obscuring phylogenetic signals. Therefore, all third codon positions were excluded during the phylogenetic analysis. The NJ analysis was performed in the MEGA5.0 [32], the MP and ML analyses were carried out using the PAUP*4.0b10 [38], and the BI analysis was conducted using MrBayes 3.1.2 [39]. In the NJ analysis, nucleotide substitution model of the Kimura-2 parameter (K2P) was selected, and the bootstrap proportion values (BPs) with internal branch tests were obtained by 1,000 replicates to estimate the support levels for the nodes in the resultant topologies. The MP tree was reconstructed with branch swapping tree bisection-reconnection (TBR) heuristic search method, and the BPs were obtained after 1000 replicates by using 10 replicates of random stepwise additions of taxa. For ML and BI analyses, the two datasets were further partitioned by 4 strategies considering gene region and codon position. For D1 dataset, the partitioning strategies (PSs) were set as (1) 13 PCGs each as a single gene partition (D1-P13), (2) two partitions, including 1st and 2nd codon position of the PCGs (D1-P2). The PSs of D2 dataset were set as (3) 15 gene partitions (D2-P15) and (4) four partitions, including 1st and 2nd codon position of the PCGs and two rRNAs (D2-P4). The optimal substitution model of each partition was determined by Modeltest 3.7 [40], using the corrected Akaike information criterion (AICc). For ML analysis, the general time reversible model with invariant sites and among—site variation (GTR+I+G) was selected as the best fit model for each partition, and the BPs of the tree were also evaluated via the bootstrap test with 1000 iterations. The BI analysis was conducted with the same best fit substitution model used as the one selected for the ML analysis. Two independent runs of four incrementally heated MCMC chains (one cold chain and three hot chains) were simultaneously run for one million generations in all datasets, with each sampling of 100 generations, when the convergence of MCMC chains was achieved (<0.01), the first 25% of the sampled trees were discarded as samples of burn-in. The confidence values of the BI tree are presented as the Bayesian posterior probabilities (BPP). All the tree topologies were evaluated using the Approximately Unbiased (AU) [41], Shimodaira-Hasegawa (SH) [42], Kishino-Hasegawa (KH) [43], weighted SH (WSH) [42] and weighted KH (WKH) [43] methods by the Consel v0.2 software [44].

Results and Discussion

Genome structure and organization

In this study, nine complete mitogenome sequences from P. nepenthes (15,333 bp), A. ariadne (15,179 bp), H. bolina (15,260 bp), C. biblis (15,211 bp), E. hypermnestra (15,167 bp), L. dura (15,259 bp), C. suroia (15,208 bp), C. thyodamas (15,254 bp), P. aglea (15,210 bp) and two nearly complete mitogenomes from D. nesimachus (14,367 bp) and S. howqua (14,020 bp) were determined, representing nine subfamilies of Nymphalidae (Tables 1 and 2). All genes identified in the eleven mitochondrial genomes are typical insect mitochondrial genes with normal gene sizes [45]. In all, 37 genes (13 PCGs, 22 tRNAs, 2 rRNAs) and an A+T-rich region were identified in the nine completely sequenced mitogenomes, in which, the gene order was identical to that of the other nymphalid mitogenomes sequenced to date, but different from the gene order in inferred ancestral insects [45] (Fig 1). The regions that we failed to sequence in other two mitogenomes were A+T-rich region and gene cluster trnM—trnI—trnQ, which are usually located in or around rrnS and nad2 (Fig 1), where extremely high A+T content and stable stem-loop structures may have disrupted PCR and sequencing reactions, a common problem in sequencing insect mitochondrial genomes [23, 46].
Table 2

Characterization of the nymphalid mitogenomes used in this study.

SpeciesWhole genomePCG a rrnL rrnS AT-rich region
Size (bp)A+T (%)Size (bp) b A+T (%)Size (bp)A+T (%)Size (bp)A+T (%)Size (bp)A+T (%)
Apatura ilia 15,24280.511,13378.91,33384.077684.940392.5
Apatura metis 15,23680.511,10978.91,33384.577984.839492.9
Sasakia charonda 15,24479.911,10878.21,32384.477585.038091.8
Sasakia funebris 15,23381.211,18180.01,33384.677285.537093.0
Timelaea maculate 15,17881.111,14579.61,33285.077785.938293.2
Ariadne ariadne 15,17980.111,16378.81,32083.979984.131992.8
Cyrestis thyodamas 15,25481.111,18179.71,32884.678085.138091.6
Kallima inachus 15,18380.311,16379.21,33582.777485.137692.0
Melitaea cinxia 15,17080.011,15478.61,33684.677284.633892.9
Hypolimnas bolina 15,26079.711,15478.11,33383.377684.935693.3
Junonia orithya 15,21480.411,15479.21,32682.777584.933194.9
Dichorragia nesimachus 14,36781.111,00180.31,36985.062183.3--
Argyreus hyperbius 15,15680.811,15479.51,33084.577885.234995.4
Acraea issoria 15,24579.711,15178.01,33183.978883.743096.0
Fabriciana nerippe 15,14080.911,15779.61,32184.477384.932995.4
Issoria lathonia 15,17281.111,15479.91,31984.477185.136196.1
Heliconius melpomene 15,32881.711,10680.31,36485.777985.126895.9
Cethosia biblis 15,21179.811,16378.41,31683.777485.533493.4
Parathyma sulpitia 15,26881.911,18780.61,31984.777985.734994.6
Libythea celtis 15,16481.211,16680.01,33584.777585.432896.3
Eumenis autonoe 15,48979.111,18476.81,33583.777585.367894.5
Melanitis leda 15,12279.811,16378.41,33284.077485.031489.5
Elymnias hypermnestra 15,16780.411,16079.11,33184.776885.040590.6
Lethe dura 15,25979.311,18177.51,34183.877585.740992.2
Callerebia suroia 15,20879.511,12177.91,34584.377785.739394.2
Stichophthalma howqua 14,02078.510,84877.01,33284.660084.2--
Polyura nepenthes 15,33380.811,21479.41,38984.977284.543789.5
Calinaga davidis 15,26780.411,21178.81,33783.877385.938992.0
Euploea mulciber 15,16681.511,13980.21,31484.677685.339993.5
Danaus plexippus 15,31481.311,14879.91,34084.477486.146994.5
Danaus chrysippus 15,23680.311,14578.71,31483.877984.741895.0
Ideopsis similes 15,20081.611,12180.21,31584.777685.940495.7
Parantica aglea 15,21079.611,12177.81,30984.177584.944393.2

▲ Partial mitogenome lacking in the A+T-rich region, trnM, trnI and trnQ, partial nad2 and rrnS sequence. Bar (-) indicates lack of sequence information for the A + T region in the genome.

a Protein coding genes.

b Termination codons were excluded in total size count.

Fig 1

Gene arrangement of eleven nymphalid mitochondrial genomes sequenced in this study.

Abbreviations for the genes are as follows: cox1, cox2, and cox3 refer to the cytochrome oxidase subunits, cob refers to cytochrome b, and nad1—nad6 refer to NADH dehydrogenase components, rrnL and rrnS refer to ribosomal RNAs. Transfer RNA genes are denoted by one letter symbols according to the IPUC-IUB one-letter amino acid codes. L1, L2, S1, S2 denote tRNALeu(CUN), tRNALeu(UUR), tRNASer(AGN), tRNASer(UCN), respectively. Genes coded on the majority strand are light/dark-green. Genes coded on the minority strand are red or orange. Alternation of colors was applied for distinction. The non-coding regions are presented as cyan/yellow dots. The unknown portions of partial mtDNAs are gray.

▲ Partial mitogenome lacking in the A+T-rich region, trnM, trnI and trnQ, partial nad2 and rrnS sequence. Bar (-) indicates lack of sequence information for the A + T region in the genome. a Protein coding genes. b Termination codons were excluded in total size count.

Gene arrangement of eleven nymphalid mitochondrial genomes sequenced in this study.

Abbreviations for the genes are as follows: cox1, cox2, and cox3 refer to the cytochrome oxidase subunits, cob refers to cytochrome b, and nad1nad6 refer to NADH dehydrogenase components, rrnL and rrnS refer to ribosomal RNAs. Transfer RNA genes are denoted by one letter symbols according to the IPUC-IUB one-letter amino acid codes. L1, L2, S1, S2 denote tRNALeu(CUN), tRNALeu(UUR), tRNASer(AGN), tRNASer(UCN), respectively. Genes coded on the majority strand are light/dark-green. Genes coded on the minority strand are red or orange. Alternation of colors was applied for distinction. The non-coding regions are presented as cyan/yellow dots. The unknown portions of partial mtDNAs are gray. All analyzed nymphalid mitogenomes are consistently AT biased, with values from 79.1% in Eumenis autonoe (subfamily Satyrinae) to 81.9% in Parathyma sulpitia (subfamily Limenitidinae), averaging at 80.5% (Table 2). However, these values fall within the range of the A+T content for other Lepidoptera species, from 77.8% in Ochrogaster lunifer [47] to 82.7% in C. raphaelis [48]. The nucleotide skewness statistics for all nymphalid mitogenomes indicate slight A or T skews with AT-skew values ranging from -0.068 in C. biblis (subfamily Heliconiinae) to 0.019 in H. bolina (subfamily Nymphalinae), and moderate C skews with GC-skew values ranging from -0.274 in S. howqua (subfamily Morphinae) to -0.178 in P. Sulpitia (subfamily Limenitidinae) (S1 Fig). A similar trend has been observed in other lepidopterans, the AT-skew of which ranges from -0.047 (C. raphaelis) to 0.059 (Bombyx mori) and the GC-skew from -0.318 (O. lunifer) to -0.158 (C. raphaelis) [47, 48].

Protein coding genes

Like those of other determined lepidopteran mitogenomes, nine PCGs (nad2, cox1, cox2, atp8, atp6, cox3, nad3, nad6, and cob) of the eleven mitogenomes are coded on the majority strand, while the other four genes (nad1, nad4, nad4L, nad5) coded on the minority strand. All PCGs are initiated by typical ATN, with the exception of the cox1 which uses the unusual CGA(R) as observed in most other sequenced lepidopterans [35, 47, 49–54]; all PCGs are terminated with TAA/TAG, or with truncated codons TA or T which are presumed to be completed via post-transcriptional polyadenylation [55, 56] (S2 Table). The A+T contents of the PCGs, excluding stop codons, were similar to other nymphalid mitogenomes, and fall within the range from 76.8% in E. autonoe (subfamily Satyrinae) to 80.6% in P. sulpitia (subfamily Limenitidinae), with the average value equal to 79.0%. In all newly sequenced mitogenomes, the third codon positions have a considerably higher A+T content than the first and second positions (S2 Fig) as seen in other sequenced nymphalids. The abundance of codon families and Relative Synonymous Codon Usage (RSCU) in 13 PCGs were investigated for the eleven newly determined mitogenomes, as shown in Figs 2 and 3. All stop codons, complete and incomplete, were excluded from the analysis to avoid biases due to incomplete stop codons. Total number of non-stop codons (CDs) in nine newly sequenced complete mitogenomes is very similar to that in other available nymphalid mitogenomes, ranging from 3,695 of S. charonda (subfamily Apaturinae) to 3,738 of Polyura nepenthes (subfamily Charaxinae) (S3 Fig). Although the numbers of CDs in two nearly complete mitogenomes are less than that in other complete mitogenomes, the codon families exhibit a very similar behavior among all newly sequenced mitogenomes. The five most frequently used codon families (Leu (UUR), Ile, Phe, Met and Asn), each with at least 60 CDs per thousand CDs, were two-fold degenerate in codon usage and were rich in A and T in every mitogenome. The three codon families with at least 100 CDs per thousand (Leu (UUR), Ile and Phe) were found in the C. biblis (subfamily Heliconiinae), A. ariadne (subfamily Biblidinae), H. bolina (subfamily Nymphalinae), D. nesimachus (subfamily Pseudergolinae), E. hypermnestra and C. suroia (subfamily Satyrinae) mitogenomes, whereas only Leu (UUR) and Ile units were found in remaining five newly sequenced mitogenomes. However, the rarest used codon family is Cys, which displayed no more than 10 CDs per thousand in eleven mitogenomes (Fig 2). Additionally, the RSCU values of NNU and NNA are greater than 1 (excluding the NNA for Leu (UUR)), indicating a strong A+T-bias in their third codon position, whereas the lost codons usually belong to G+C-rich codons (Fig 3). The codon usage bias may be positively correlated with the AT bias of the third codon position for the insect mitogenomes [47, 49, 52, 57–60].
Fig 2

Codon distribution in the eleven newly sequenced nymphalid mitogenomes.

Numbers to the left refer to the total number of codons. CDspT, codons per thousand codons. Codon families are provided on the x axis.

Fig 3

Relative synonymous codon usage (RSCU) in the eleven newly sequenced nymphalid mitogenomes.

Codon families are given on the x axis. Codons that are not present in the genome are indicated in purple.

Codon distribution in the eleven newly sequenced nymphalid mitogenomes.

Numbers to the left refer to the total number of codons. CDspT, codons per thousand codons. Codon families are provided on the x axis.

Relative synonymous codon usage (RSCU) in the eleven newly sequenced nymphalid mitogenomes.

Codon families are given on the x axis. Codons that are not present in the genome are indicated in purple.

Transfer RNA and ribosomal RNA genes

All tRNAs of the eleven nymphalid species harbor the typical cloverleaf structures commonly found in insects, except for trnS1(AGN) whose dihydrouridine (DHU) arm is replaced by a simple loop (maps of secondary structure are not shown here). Two rRNA (rrnL and rrnS) genes are located between trnL1 (CUN) and trnV, and between trnV and the A+T-rich region, respectively (Fig 1). Of these two rRNAs, the larger one (rrnL gene) has attracted much more attention in the studies of systematic evolution and classification [61]. Here, we first compared the entire secondary structure of all the nymphalid rrnL genes for more phylogenetically informative signals. The results showed that all the rrnL genes harbored six domains (domain III is absent in Arthropoda) with 49 helices, which is broadly congruent with those proposed for other insects [46, 47]. D. nesimachus was shown in S4 Fig as an example with the remaining species not listed. Among this nymphalid species, some of the rrnL highly variable regions, such as the stem-loops of H837/D13, 14 in domain II and H2077/G3 in domain V were also shown (S5 and S6 Figs). The structural characteristics of H837/D13, 14 and H2077/G3, suggest that the A. ariadne (subfamily Biblidinae) is closer with C. thyodamas (subfamily Cyrestinae) than with D. nesimachus (subfamily Pseudergolinae), the S. howqua (subfamily Morphinae) is most likely to be related to Satyrinae species, and so on.

Non-coding regions

The non-coding (NC) regions of eleven newly sequenced mitogenomes were illustrated in Fig 1. There were three distinct large NC regions with at least 15 bp in all complete mitogenomes. The first large NC region located between trnQ and nad2, appeared to be common in the nine mitogenomes (including P. nepenthes, A. ariadne, H. bolina, C. biblis, E. hypermnestra, L. dura, C. suroia, C. thyodamas, P. aglea), and this region was also detected in other nymphalid mitogenomes, varying in length from 40 bp in E. hypermnestra (subfamily Satyrinae) to 87 bp in S. charonda and Sasakia funebris (subfamily Apaturinae). The second large NC region inserted between trnS2 and nad1, was present in A. ariadne (subfamily Biblidinae), H. bolina (subfamily Nymphalinae), C. biblis (subfamily Heliconiinae), C. thyodamas (subfamily Cyrestinae), C. suroia (subfamily Satyrinae) and P. aglea (subfamily Danainae) mitogenomes, whereas this region was absent in E. hypermnestra and L. dura (subfamily Satyrinae) mitogenomes or showed no more than 15 bp in P. nepenthes (subfamily Charaxinae) mitogenome. The third large NC region located between rrnS and trnM, coincided with the A+T-rich region, also called the CR. The CR regions have no conspicuous macro-repeat units (>50 bp), which are occasionally found in other lepidopteran insects [62-65]. Nonetheless, several conserved structures characteristic of lepidopterans are observed, such as the poly-T stretch preceded by the ATAGA motif neighboring the rrnS gene, and the microsatellite-like elements (TA)n (n = 7–9) preceded by the ATTTA motif. Although the CR regions were not sequenced in two incomplete mitogenomes (D. nesimachus and S. howqua), a few long NC regions (>15 bp in length), including the region located between trnS2 and nad1, were also observed in D. nesimachus (subfamily Pseudergolinae) mitogenome, however, the similar region was not found in S. howqua (subfamily Morphinae) mitogenome. The phylogenetic reconstructions conducted in this study using multiple methods have produced generally similar topology for most of the major nymphalid groups. The phylogenetic relationships inferred by BI and ML methods were highly congruent and strongly supported (BPP >0.95 and BP >85% for most nodes), across the partitions for each of the two datasets (Fig 4). The topology tests show that the BI trees from PS1 (BI-P13 tree) (Fig 4A), PS2 (BI-P2 tree) (not shown), PS3 (BI-P15 tree) (Fig 4B) and PS4 (BI-P4 tree) (not shown) are the best trees for the D1 and D2 dataset (Table 3). All the trees reveal that the Nymphalidae include five major clades, namely, the nymphaline clade (including five subfamilies: Apaturinae, Biblidinae, Cyrestinae, Nymphalinae and Pseudergolinae); heliconiine clade (including two subfamilies: Heliconiinae and Limenitidinae); satyrine clade (including four subfamilies: Satyrinae, Morphinae, Charaxinae and Calinaginae); danaine clade (including single subfamily: Danainae); libytheine clade (including one subfamily: Libytheinae). These results are consistent with those obtained from previous molecular studies [5, 14, 66], whereas remarkably different from the results of morphology-based phylogenetic studies, such as the cladistic analysis by Freitas and Brown [15] based on 234 characters of different developmental stages. In order to further evaluate the probable congruence between molecular and morphological evolutions, we carefully examined all the morphological characters from Freitas and Brown [15] and others and tried to map some of them on the mitogenomic phylogenetic trees robustly obtained in this study (Fig 5), indicating that some of the selected morphological characters (developmental and adult) are somewhat compatible with the molecular phylogeny. For example, the danaine as the basal group in Nymphalidae is supported by a unique morphological plesiomorphy, the connected medius 1 vein (M1) and radius vein (R), whereas the remaining nymphalid groups lack the morphological character (Fig 5A, green hexagon); the satyrine is characterized by a closed discal cell of the forewing, whereas its sister groups (libytheine + heliconiine + nymphaline) have a slightly closed or open discal cell of the forewing (Fig 5A, cyan hexagon).
Fig 4

Phylogenetic relationships among 33 nymphalid species.

(A) Bayesian Inference and maximum likelihood phylogram obtained with the D1 dataset, which is divided into 13 partitions (D1-P13). (B) Bayesian Inference and maximum likelihood phylogram obtained with the D2 dataset, which is divided into 15 partitions (D2-P14). Numbers on each node correspond to the posterior probability values of the BI analysis (left) and the ML bootstrap percentage values for 1 000 replicates of ML analysis (right).

Table 3

Topological tests for two datasets with partitions.

datasetstreesstatistical tests*
obsAUKHSHWKHWSH
D1BI-P13/P20.00.8210.0000.8380.0000.838
ML-P13/P20.00.4540.3180.7630.3180.700
NJ37.70.0970.0850.1240.0850.138
MP38.90.0680.0550.1130.0550.099
D2BI-P15/P40.00.6290.5220.7670.5220.770
ML-P15/P40.00.5520.4780.6510.4780.800
MP0.50.5510.4770.6510.0410.525
NJ15.40.1970.1980.2780.1980.257

* obs: the log-likelihood difference δx to the best tree; AU = Approximately Unbiased; KH = Kishino-Hasegawa; SH = Shimodaira-Hasegawa; WKH = weighted KH; WSH = weighted SH.

Fig 5

Morphological character distribution mapped on the mitogenomic tree of Nymphalidae and internal clades (All the morphological characters selected are taken after Freitas and Brown, 2004).

(A) the topology of Nymphalidae clades; (B) the topology of satyrine clade; (C) the topology of heliconiine clade; (D) the topology of nymphaline clade.

Phylogenetic relationships among 33 nymphalid species.

(A) Bayesian Inference and maximum likelihood phylogram obtained with the D1 dataset, which is divided into 13 partitions (D1-P13). (B) Bayesian Inference and maximum likelihood phylogram obtained with the D2 dataset, which is divided into 15 partitions (D2-P14). Numbers on each node correspond to the posterior probability values of the BI analysis (left) and the ML bootstrap percentage values for 1 000 replicates of ML analysis (right). * obs: the log-likelihood difference δx to the best tree; AU = Approximately Unbiased; KH = Kishino-Hasegawa; SH = Shimodaira-Hasegawa; WKH = weighted KH; WSH = weighted SH.

Morphological character distribution mapped on the mitogenomic tree of Nymphalidae and internal clades (All the morphological characters selected are taken after Freitas and Brown, 2004).

(A) the topology of Nymphalidae clades; (B) the topology of satyrine clade; (C) the topology of heliconiine clade; (D) the topology of nymphaline clade. The danaids (danaine clade), formerly classified as a family under the order Lepidoptera, are now treated as a subfamily of Nymphalidae. Our study indicates that Danainae (represented by Euploea mulciber, Danaus plexippus, Danaus chrysippus, Ideopsis similes, P. aglea) is sister to the remaining nymphalids including the libytheines with strong supports (BPP = 1.00 in two BI trees and BP = 100% in two ML trees, respectively), same as results from other recent analyses [54, 66–68]. Additionally, the analysis of mitochondrial rrnL secondary structures by Hao et al. [69] showed that the rrnL morphological characteristics of the danaid species were markedly different from other nymphalid groups. Morphologically, the danaids possess some of their unique features, such as the connected M1 vein and R vein in the forewing, which are markedly distinct from other nymphalid butterflies. The satyrine clade includes Satyrinae, Morphinae, Charaxinae and Calinaginae, compatible with results of some previous studies based on morphological [15], molecular [10, 14, 70], or combined data [5, 18]. The Calinaginae (represented by Calinaga davidis) is the basal group of entire satyrine clade, instead of the sister relationship to Charaxinae as previously stated [14]. The Charaxinae (represented by P. nepenthes) is the sister to the grouping of Satyrinae plus Morphinae (Fig 3), consistent with those of multiple-gene based study [20] and most comprehensive taxa-sampling studies [5]. The association of Charaxinae with Satyrinae plus Morphinae is also supported by the morphological features, for example, all groups of satyrine clade, excluding the subfamily Calinaginae, have a bifid tail during their larval stages (Fig 5B). However, the phylogenetic position of Morphinae was not stable among trees from different dataset (Fig 4). Morphologically, the Morphinae has some distinct characteristics, such as the filiform setae on the third abdominal segment (3A) and the eighth abdominal segment (8A) during the larval stage [15] (Fig 5B). In addition, the Amathusiini and Elymniini occupy the same host plant (Arecaceae) during the larval stages, implying a close relationship. It has been proposed that Morphinae (Morphini, Brassolini and Amathusiini) should be grouped within Satyrinae [5, 20, 70–72], though other studies suggested two independent subfamilies [12, 14]. The phylogenetic status of libytheines (libytheine clade) within nymphalids has long standed as a controversial issue. They were traditionally treated as a separate family owing to their unusual morphological features such as their exceptionally long labial palpus and fully developed forelegs in female [73-75]. However, some scholars proposed that libytheines should be included in Nymphalidae as a subfamily for the presence of longitudinal ridges on the antenna shared with Nymphalidae [12, 76], sister to all other Nymphalidae because of the lack of apomorphic features such as the simple female foreleg [7, 8, 15]. Evidences from host plant use and geographic distribution together suggest that this group is sister to the remaining Nymphalidae, as a basal subfamily [9, 77, 78], and some recent studies also suggested Libytheinae or the grouping of Libytheinae plus Danainae was the basal lineage of the Nymphalidae in molecular or molecular plus morphological view [5, 10, 14, 20, 79]. In this study, L. celtis (subfamily Libytheinae) was revealed as the sister of the grouping of nymphaline plus heliconiine clades in all phylogenetic trees with strong supports, except the ML-P13 tree (33% BP value at the node) (Fig 4A). This result was concordant with previous mitogenomic phylogenies [67, 68]. Thus, the phylogenetic position of Libytheinae remains uncertain. The heliconiine clade comprises representatives of the subfamilies Heliconiinae and Limenitidinae. In this study, though our taxon sampling is limited, tribal-level relationships within subfamily Heliconiinae are well supported (Fig 4). The tribe Argynnini (represented by Argyreus hyperbius, Fabriciana nerippe, Issoria lathonia) is sister to the group composed of Acraeini (represented by Acraea issoria, C. biblis) + Heliconiini (represented by Heliconius melpomene) (BPP = 1.00 in two BI trees, BP = 100%, 99% in ML-P13 and ML-P15 trees respectively). This outcome is concordant with those reported earlier based on molecular data [5, 10, 14, 21, 80], but inconsistent with two morphologically-based studies [15, 81], in which Acraeini emerged as sister to the grouping of (Heliconiini + (Vagrantini + Argynnini)). The final major clade, the nymphaline clade, contains representatives of the subfamilies Apaturinae, Biblidinae, Cyrestinae, Nymphalinae and Pseudergolinae. However, their internal relationships, especially regarding the Pseudergolinae, Biblidinae and Cyrestinae, are still controversial [5, 14, 20, 21]. In this study, the Pseudergolinae (reprented by D. nesimachus) is placed as sister to other groups of the nymphaline clade with strong or moderate support values (BPP = 1.0 in two BI trees, BP = 71%, 88% in ML-P13 and ML-P15 trees) (Fig 4), in concordance with the result of Wahlberg et al. [5]. The C. thyodamas (subfamily Cyrestinae) is the sister of A. ariadne (subfamily Biblidinae) with strong or moderate support (BPP = 0.93 in BI-P13 and BPP = 0.77 in BI-P15 tree), but weakly supported in ML-P13 tree (BP = 50%) and ML-P15 tree (BP = 33%) (Fig 4). This relationship is consistent with that by Zhang et al. [21], but contradicting that of Walhberg et al. [5] and Wahlberg and Wheat [20], which suggested that the Cyrestinae was sister to Nymphalinae. Morphologically, the Cyrestinae and Biblidinae are similar, as they share the short forewing discal cell, which is not found in other nymphalid groups. As for the Apaturinae, in this study, their five representatives (Apatura ilia, Apatura metis, S. charonda, S. funebris, Timelaea maculata) formed a strongly supported monophyletic group with the Biblidinae plus Cyrestinae as the sister group, albeit with weak support in ML-P13 tree (BP = 42%) and ML-P15 tree (BP = 39%). All of the three subfamilies were grouped with the Nymphalinae (represented by Kallima inachus, Melitaea cinxia, H. bolina, Junonia orithya) as its sister taxon, formerly revealed as the Heliconiinae by Freitas and Brown [15], Biblidinae + Apaturinae by Brower [10], Apaturinae by Wahlberg et al. [14] or Cyrestinae by Wahlberg et al. [5]. We found that the internal relationships of the nymphaline clade established in this study are consistent with their morphological pattern; e.g., all groups of nymphaline clade, excluding the basal group (Pseudergolinae), have a strongly marked longitudinal ridges during the egg stage; the Nymphalinae is characterized by an additional scolus on the ninth abdominal segment (9A) posteroventral to the filiform setae during the larval stage, which is absent in its sister groups ((Biblidinae + Cyrestinae) + Apaturinae) [15] (Fig 5D).

Scatter plot of AT- and GC-skews in the nymphalid subfamilies.

Composition skewness was calculated according to the formulas (AT-skew = [A-T] / [A+T]; GC-skew = [G-C] / [G+C]). All the species that are represented are listed in Table 1. (TIF) Click here for additional data file.

The variation of AT content in different codon position of the concatenated 13 PCGs in the Nymphalidae mitogenomes.

The 1st, 2nd and 3rd represent nucleotide codon position 1, 2 and 3 respectively. All the species that are represented are listed in Table 1. (TIF) Click here for additional data file.

Comparison of the total number of the amino acids of the mitogenomic PCGs across the nymphalid species.

All the species that are represented are listed in Table 1. (TIF) Click here for additional data file.

Predicted rrnL secondary structure in D. nesimachus mitogenome.

Roman numerals denote the conserved domain structure. Helices are shaded in grey. Tertiary structures are denoted by boxed bases joined by solid lines. Base-pairing is indicated as follows: Watson-Crick pairs by dashes, wobble GU pairs by red dots and other non-canonical pairs by green dots. (TIF) Click here for additional data file.

The secondary structures of H837/D13, 14 in rrnL gene from all Nymphalidae species used in this study.

Symbols are as in S4 Fig. (TIF) Click here for additional data file.

The secondary structures of H2077/G3 in rrnL gene from all Nymphalidae species used in this study.

Symbols are as in S4 Fig. (TIF) Click here for additional data file.

Materials and their sources in this study.

(PDF) Click here for additional data file.

Predicted initiation and termination codons distribution of the 13 mitochondrial PCGs among Nymphalidae species in this study.

(PDF) Click here for additional data file.
  52 in total

1.  An approximately unbiased test of phylogenetic tree selection.

Authors:  Hidetoshi Shimodaira
Journal:  Syst Biol       Date:  2002-06       Impact factor: 15.683

2.  Synergistic effects of combining morphological and molecular data in resolving the phylogeny of butterflies and skippers.

Authors:  Niklas Wahlberg; Michael F Braby; Andrew V Z Brower; Rienk de Jong; Ming-Min Lee; Sören Nylin; Naomi E Pierce; Felix A H Sperling; Roger Vila; Andrew D Warren; Evgueni Zakharov
Journal:  Proc Biol Sci       Date:  2005-08-07       Impact factor: 5.349

3.  Higher level phylogeny of Satyrinae butterflies (Lepidoptera: Nymphalidae) based on DNA sequence data.

Authors:  Carlos Peña; Niklas Wahlberg; Elisabet Weingartner; Ullasa Kodandaramaiah; Sören Nylin; André V L Freitas; Andrew V Z Brower
Journal:  Mol Phylogenet Evol       Date:  2006-03-24       Impact factor: 4.286

4.  The mitochondrial genome of the Korean hairstreak, Coreana raphaelis (Lepidoptera: Lycaenidae).

Authors:  I Kim; E M Lee; K Y Seol; E Y Yun; Y B Lee; J S Hwang; B R Jin
Journal:  Insect Mol Biol       Date:  2006-04       Impact factor: 3.585

5.  Sequence and organization of the human mitochondrial genome.

Authors:  S Anderson; A T Bankier; B G Barrell; M H de Bruijn; A R Coulson; J Drouin; I C Eperon; D P Nierlich; B A Roe; F Sanger; P H Schreier; A J Smith; R Staden; I G Young
Journal:  Nature       Date:  1981-04-09       Impact factor: 49.962

6.  Characterization of mitochondrial genome of Chinese wild mulberry silkworm, Bomyx mandarina (Lepidoptera: Bombycidae).

Authors:  Minhui Pan; Quanyou Yu; Yuling Xia; Fangyin Dai; Yanqun Liu; Cheng Lu; Ze Zhang; Zhonghuai Xiang
Journal:  Sci China C Life Sci       Date:  2008-08-03

7.  The complete mitochondrial genome of the fall webworm, Hyphantria cunea (Lepidoptera: Arctiidae).

Authors:  Fang Liao; Lin Wang; Song Wu; Yu-Ping Li; Lei Zhao; Guo-Ming Huang; Chun-Jing Niu; Yan-Qun Liu; Ming-Gang Li
Journal:  Int J Biol Sci       Date:  2010-03-29       Impact factor: 6.580

8.  Comparative mitogenomics of Braconidae (Insecta: Hymenoptera) and the phylogenetic utility of mitochondrial genomes with special reference to Holometabolous insects.

Authors:  Shu-jun Wei; Min Shi; Michael J Sharkey; Cornelis van Achterberg; Xue-xin Chen
Journal:  BMC Genomics       Date:  2010-06-11       Impact factor: 3.969

9.  The complete mitochondrial genomes of two ghost moths, Thitarodes renzhiensis and Thitarodes yunnanensis: the ancestral gene arrangement in Lepidoptera.

Authors:  Yong-Qiang Cao; Chuan Ma; Ji-Yue Chen; Da-Rong Yang
Journal:  BMC Genomics       Date:  2012-06-22       Impact factor: 3.969

10.  The complete mitochondrial genome of the bag-shelter moth Ochrogaster lunifer (Lepidoptera, Notodontidae).

Authors:  Paola Salvato; Mauro Simonato; Andrea Battisti; Enrico Negrisolo
Journal:  BMC Genomics       Date:  2008-07-15       Impact factor: 3.969

View more
  6 in total

1.  The complete mitochondrial genome of Vanessa indica and phylogenetic analyses of the family Nymphalidae.

Authors:  Youxue Lu; Naiyi Liu; Liuxiang Xu; Jie Fang; Shuyan Wang
Journal:  Genes Genomics       Date:  2018-06-14       Impact factor: 1.839

2.  New Complex of Cryptic Species Discovered in Genus Biblis (Papilionoidea: Nymphalidae: Biblidinae) in Mexico.

Authors:  Hugo Álvarez-García; Salima Machkour-M'Rabet; Armando Luis Martínez; Carmen Pozo
Journal:  Neotrop Entomol       Date:  2022-06-23       Impact factor: 1.650

3.  The complete mitochondrial genome of Damora sagana and phylogenetic analyses of the family Nymphalidae.

Authors:  Naiyi Liu; Na Li; Pengyu Yang; Chunqin Sun; Jie Fang; Shuyan Wang
Journal:  Genes Genomics       Date:  2017-10-17       Impact factor: 1.839

4.  Dated phylogeny and dispersal history of the butterfly subfamily Nymphalinae (Lepidoptera: Nymphalidae).

Authors:  Chengyong Su; Qinghui Shi; Xiaoyan Sun; Junye Ma; Chunxiang Li; Jiasheng Hao; Qun Yang
Journal:  Sci Rep       Date:  2017-08-18       Impact factor: 4.379

5.  Our love-hate relationship with DNA barcodes, the Y2K problem, and the search for next generation barcodes.

Authors:  Jeffrey M Marcus
Journal:  AIMS Genet       Date:  2018-01-17

6.  The complete mitogenome of Habropoda rodoszkowskii (Hymenoptera: Apidae) and phylogenetic analysis.

Authors:  Huanhuan Lu; Dunyuan Huang
Journal:  Mitochondrial DNA B Resour       Date:  2020-06-04       Impact factor: 0.658

  6 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.