The complete mitochondrial genome of Lerema accius and its phylogenetic implications.

Qian Cong, Nick V Grishin.

Abstract

Butterflies and moths (Lepidoptera) are becoming model organisms for genetics and evolutionary biology. Decoding the Lepidoptera genomes, both nuclear and mitochondrial, is an essential step in these studies. Here we describe a protocol to assemble mitogenomes from next generation sequencing reads obtained through whole-genome sequencing and report the 15,338 bp mitogenome of Lerema accius. The mitogenome is AT-rich and encodes 13 proteins, 22 transfer RNAs, and two ribosomal RNAs, with a gene order typical for Lepidoptera mitogenomes. A phylogenetic study based on the protein sequences, using both Bayesian Inference and Maximum Likelihood methods, consistently places Lerema accius with other grass skippers (Hesperiinae).

Keywords:  Clouded Skipper; De novo assembly; Hesperiinae; Illumina sequencing

Year:  2016        PMID: 26788426      PMCID: PMC4715447          DOI: 10.7717/peerj.1546

Source DB:  PubMed          Journal:  PeerJ        ISSN: 2167-8359            Impact factor:   2.984


Introduction

The order Lepidoptera contains approximately 160,000 described and half a million estimated species (Kristensen, Scoble & Karsholt, 2007). It represents one of the most diverse and fascinating groups of insects, with many species emerging as model organisms for genetics and evolution (Clarke & Sheppard, 1972; Nishikawa et al., 2013; Kunte et al., 2014; Zhan et al., 2011; Hines et al., 2012; Surridge et al., 2011; Engsontia et al., 2014; Zhang, Kunte & Kronforst, 2013). Studies of these model organisms benefit significantly from decoding the genomes of select Lepidoptera species.

Recently, we published the genome draft of the Clouded Skipper Lerema accius using next generation sequencing techniques (Cong et al., 2015). Traditional genome assemblers failed to automatically assemble the Clouded Skipper mitogenome together with the nuclear genome. This failure probably resulted from the difficulty of distinguishing mitogenome NGS reads from those of the nuclear genome, as well as from a high frequency of erroneous k-mers caused by the high coverage of mitochondrial DNA. However, a dedicated effort should allow assembly of the mitogenome from whole-genome sequencing reads.

The insect mitogenome is circular, consisting of 14–19 kilobases (kb) that contain 13 protein-coding genes (PCGs), two ribosomal-RNA-coding genes (rRNAs), 22 transfer-RNA-coding genes (tRNAs), and an A + T rich displacement loop (D-loop) control region (Cameron, 2014). Because of their maternal inheritance, compact structure, lack of genetic recombination, and relatively fast evolutionary rate, mitogenomes have been used widely in molecular phylogenetics and evolution studies (Cameron, 2014; Moritz, Dowling & Brown, 1987). Here, we assemble and annotate the complete mitogenome of Lerema accius from next generation sequencing reads. Phylogenetic analyses using published mitogenomes of skipper butterflies (Hesperiidae) place Lerema accius among other grass skippers (Hesperiinae).

Methods

Library preparation and Illumina sequencing

We collected a male Lerema accius adult in the field (USA: Texas: Dallas County, Dallas, White Rock Lake, Olive Shapiro Park, 10-Nov-2013, GPS: 32.8621, −96.7305, elevation: 141 m) under permit #08-02Rev from Texas Parks and Wildlife Department (Natural Resources Program Director David H. Riskind). We removed the wings and abdomen of the deceased specimen and used the remaining tissue to extract genomic DNA using the ChargeSwitch gDNA mini tissue kit (Life Technologies, Grand Island, NY, USA). About 500 ng of genomic DNA was used to prepare 250 bp and 500 bp paired-end libraries, respectively, following the Illumina TruSeq DNA sample preparation guide using enzymes from NEBNext Modules (New England Biolabs, Ipswich, MA, USA). These two libraries, which together occupied about 60% of one Illumina lane, were pooled with other libraries (not used for the mitogenome assembly) and sequenced for 150 bp from both ends in a rapid run on the Illumina HiSeq 2500 platform at the UT Southwestern Medical Center genomics core facility. The sequencing reads have been deposited in the NCBI SRA database under accession numbers SRR2089773–SRR2089775.

Mitogenome assembly

Sequencing reads were processed by MIRABAIT (Chevreux, Wetter & Suhai, 1999) to remove contamination from sequencing adapters and to trim low-quality regions (quality score <20) at both ends. Using the mitogenomes of four skippers (Carterocephalus silvicola, Potanthus flavus, Polytremis nascens and Polytremis jigongi) as references, we applied mitochondrial baiting and iterative mapping (MITObim) v1.6 (Hahn, Bachmann & Chevreux, 2013) to extract the sequencing reads of the mitogenome in the 250 bp and 500 bp libraries. About 1,161,000 reads (1.04% of all reads) were extracted using MITObim. Because the average size of the reference mitogenomes is 15,400 bp, we expected an average coverage of about 22,600 fold (1,161,000×150×2/15,400). We used JELLYFISH software (Marcais & Kingsford, 2011) to obtain the frequencies of 15-mers in these reads. The frequencies of some 15-mers were much lower than expected. They might come from regions in the mitogenome that were poorly covered in the sequencing reads; alternatively, they might arise from sequencing errors, heterogeneity among different copies of mitochondrial DNA, or reads from the nuclear genome. The second scenario could cause problems in de novo assembly, and thus we applied QUAKE (Kelley, Schatz & Salzberg, 2010) to correct errors in 15-mers with frequencies lower than 1,000 and excluded reads that still contained low-frequency 15-mers after error correction. We assembled the error-corrected reads into contigs de novo with Platanus (Kajitani et al., 2014). The contigs were further assembled into scaffolds using all the reads (including the ones containing 15-mers with frequencies lower than 1,000). This automatic procedure assembled a draft mitogenome of 15,332 bp without any gaps.
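The coverage estimate and the k-mer frequency screen described above can be illustrated with a minimal Python sketch. It recomputes the expected coverage from the numbers given in the text and counts k-mers in a toy read set. The function names, the toy reads, and the toy values of k and the cutoff are assumptions made for illustration only; the study itself used JELLYFISH and QUAKE with 15-mers and a frequency cutoff of 1,000.

```python
from collections import Counter

def expected_coverage(n_reads, read_len, genome_len):
    """Expected fold coverage, using the same formula as the text
    (reads x read length x 2 / genome length)."""
    return n_reads * read_len * 2 / genome_len

def count_kmers(reads, k):
    """Count every k-mer that occurs in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def low_frequency_kmers(counts, cutoff):
    """Return the k-mers whose frequency is below the cutoff."""
    return {kmer for kmer, n in counts.items() if n < cutoff}

# Numbers from the text: 1,161,000 reads, 150 bp, 15,400 bp reference size.
coverage = expected_coverage(1_161_000, 150, 15_400)

# Toy data (hypothetical): the second read carries one "erroneous" base.
toy_reads = ["ACGTACGT", "ACGTACGA", "ACGTACGT"]
toy_counts = count_kmers(toy_reads, k=4)
rare = low_frequency_kmers(toy_counts, cutoff=2)
```

With the toy reads, the only 4-mer flagged as rare is the one created by the mismatching base, which mirrors how low-frequency 15-mers point to likely errors when the true coverage is very high.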
However, since the genome assembler Platanus is not designed to assemble circular mitogenomes, the linear representation of the circular DNA may either (1) miss a fragment after its 3′-terminus and before its 5′-terminus or (2) have redundant fragments that appear at both the 3′-terminus and the 5′-terminus. We manually inspected the sequences at the 5′- and 3′-termini and found that there was no redundant fragment; instead, a fragment of six base pairs was missing. We determined the sequence of the missing fragment by searching the sequencing reads for the two 32 bp fragments at the 5′- and 3′-termini of the draft mitogenome and selecting the sequence between them. A majority (99.8%) of the reads revealed the same missing fragment (the others likely contained sequencing errors), and we manually added it to the mitogenome. We also adjusted the linear representation of the circular DNA by circular permutation so that the sequence started with the trnM(cau) gene, which is the convention for most Lepidoptera sequences deposited in the database.
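The gap-closing and circular-permutation steps can be sketched in Python as follows. This is a simplified illustration, not the procedure as actually run: the function names and the toy sequences are hypothetical, the flanks are 5 bp rather than the 32 bp used in the study, and reverse-complement reads are ignored.

```python
from collections import Counter

def find_junction_inserts(reads, tail, head):
    """Tally the sequences found between the draft's 3' end (tail)
    and its 5' end (head) in reads that span the junction."""
    inserts = Counter()
    for read in reads:
        start = read.find(tail)
        if start == -1:
            continue
        after_tail = start + len(tail)
        end = read.find(head, after_tail)
        if end != -1:
            inserts[read[after_tail:end]] += 1
    return inserts

def rotate_circular(seq, start):
    """Circularly permute seq so that it begins at index start."""
    return seq[start:] + seq[:start]

# Toy draft assembly (hypothetical) that lacks three bases at the junction.
draft = "GGCATTCAGTACCGTA"
flank = 5
reads = [
    "TACCGTATTAGGCATTC",   # spans the junction, insert TTA
    "CCGTATCAGGCAT",       # spans the junction, insert differs (toy error)
    "GTACCGTATTAGGCATT",   # spans the junction, insert TTA
]
inserts = find_junction_inserts(reads, draft[-flank:], draft[:flank])
majority = inserts.most_common(1)[0][0]

closed = draft + majority
# Rotate so that the sequence starts at a chosen gene; "CAGT" stands in
# for the first bases of that gene in this toy example.
rotated = rotate_circular(closed, closed.find("CAGT"))
```

Taking the most frequent insert mirrors the majority rule described in the text, where 99.8% of the reads agreed on the missing fragment.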

Annotation and analysis of the mitochondrial genome

The mitogenome sequence was annotated using the MITOS web server (Bernt et al., 2013). We translated the sequences of PCGs to protein sequences using the genetic code for invertebrate mitogenomes. The predictions from MITOS were manually curated using other published skipper mitogenomes as references, and the starts and ends of genes were modified, if necessary, to be consistent with other species. The open reading frames (after modification) of the protein coding genes were validated. Secondary structures of tRNA genes were predicted using the same server.
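Translation with the invertebrate mitochondrial genetic code and a basic open-reading-frame check can be sketched as below. The amino acid string follows NCBI translation table 5, which differs from the standard code in reading AGA and AGG as Ser, ATA as Met, and TGA as Trp. The function names and the toy sequences are assumptions for illustration; the annotation in the study was produced by MITOS and curated manually.

```python
BASES = "TCAG"
# NCBI translation table 5 (invertebrate mitochondrial code), with codons
# in the order TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds):
    """Translate a coding sequence (A, C, G, T only).
    Trailing bases that do not form a full codon are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))

def has_open_reading_frame(cds):
    """True if no stop codon occurs before the final codon."""
    protein = translate(cds)
    body = protein[:-1] if protein.endswith("*") else protein
    return "*" not in body

# Toy sequences (hypothetical). ATA, AGA and TGA are the codons whose
# meaning differs from the standard genetic code.
toy_protein = translate("ATAAGATGATTTTAA")
toy_truncated = translate("ATTTGAT")   # ends with an incomplete stop codon T
```

Under the standard code the toy sequence would contain an internal stop at TGA, so choosing the correct table matters when validating the open reading frames.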

Assembly quality assessment

We mapped the 250 bp and 500 bp paired-end reads to the mitogenome using bowtie2 v2.2.3 (Langmead & Salzberg, 2012) and processed the results with SAMtools (Li et al., 2009). Coverage depth at each position was calculated based on this mapping result. Because reads that span the junction between the 3′- and 5′-termini of the linear representation would map to only one terminus or fail to map, the coverage at the termini could be underestimated. Therefore, we recalculated the coverage for the 1,000 bp segments at the 5′- and 3′-termini based on the mapping result to another linear representation of the circular mitogenome, obtained by connecting the 5′-terminal half to the end of the 3′-terminal half. Only two regions of the mitogenome showed coverage below 1,000 fold. One of them was a low-complexity region that contained mostly (85%) T, and the other was a 46 bp fragment of AT repeats. Such AT-rich regions tend to be underrepresented in the sequencing libraries because they break more easily during library preparation (Benjamini & Speed, 2012). Manual inspection, in which we searched the sequencing reads for the flanking regions of these poorly covered fragments, revealed variation in the length of the poly-T sequence in the first fragment and in the number of AT repeats in the second fragment. This variation might correspond to heterogeneity among different copies of mitochondrial DNA in the specimen. We confirmed that the mitogenome produced by the de novo assembler represented the dominant form among the possible variants. We further assessed the quality of our assembly by its consistency with other published skipper mitogenomes in the protein-, rRNA- and tRNA-coding regions. We aligned the rRNA- and tRNA-coding sequences directly and aligned translated sequences for PCGs. Alignments confirmed that our sequences were consistent with the majority of available mitogenomes, and gaps occurred only in regions that are poorly conserved among other skipper species.
In addition, the COI barcode (5′-terminal region of the cytochrome oxidase subunit 1 coding gene) of Lerema accius was reported previously (GenBank accession: GU088418.1), and this sequence agreed 100% with the corresponding region in our mitogenome.
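The terminal-coverage correction and the search for poorly covered regions can be sketched in Python as follows. The sketch assumes that per-position depth has already been computed for both linear representations, and that the second representation is the original sequence rotated by a fixed offset (half the genome length in the procedure described above). All function names and the toy depth values are hypothetical.

```python
def rotated_to_original(pos, genome_len, offset):
    """Map a 0-based position on the rotated reference
    (original[offset:] + original[:offset]) back to original coordinates."""
    return (pos + offset) % genome_len

def merge_terminal_coverage(cov_linear, cov_rotated, offset, flank):
    """Replace the depth of the first and last `flank` positions with the
    depth measured on the rotated reference."""
    n = len(cov_linear)
    merged = list(cov_linear)
    for rot_pos, depth in enumerate(cov_rotated):
        orig = rotated_to_original(rot_pos, n, offset)
        if orig < flank or orig >= n - flank:
            merged[orig] = depth
    return merged

def low_coverage_regions(cov, threshold):
    """Return half-open (start, end) intervals with depth below threshold."""
    regions, start = [], None
    for i, depth in enumerate(cov):
        if depth < threshold and start is None:
            start = i
        elif depth >= threshold and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(cov)))
    return regions

# Toy depths (hypothetical): the ends look poorly covered on the linear
# reference only because junction-spanning reads could not be mapped.
cov_linear = [2, 5, 9, 9, 3, 9, 9, 9, 5, 2]
cov_rotated = [2, 5, 9, 9, 9, 9, 9, 9, 5, 2]   # reference rotated by 5
merged = merge_terminal_coverage(cov_linear, cov_rotated, offset=5, flank=2)
regions = low_coverage_regions(merged, threshold=5)
```

After merging, the apparent drop at both termini disappears and only the genuinely low region in the middle of the toy profile remains.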

Phylogenetic analysis

The mitogenomes of 13 other skipper species that were available (up to June 2015) were downloaded from NCBI (Table 1). Three moths from the family Geometridae and three species from the family Papilionidae were used as outgroups. To validate each mitogenome sequence, we ran a BLAST search against all available sequences of the same species in the non-redundant database. For most species, some individual genes in the mitogenome had been sequenced independently, and the mitogenome sequence was consistent with these gene sequences (sequence identity >95%). However, four skippers (Ampittia dioscorides, Choaspes benjaminii, Ochlodes venata and Polytremis nascens, three of which are unpublished but available from GenBank) were excluded from downstream analyses because of poor agreement with at least one gene sequence found in GenBank.
Table 1

List of taxa analyzed in present paper.

Species                    Length (bp)  Identity  Accession    References
Ampittia dioscorides       15,313       91.2%     KM102732.1   XW Yang et al., 2014, unpublished data
Apocheima cinerarium       15,722       n.a.      NC_024824.1  Liu et al. (2014)
Biston suppressaria        15,628       99.7%     NC_027111.1  Chen et al. (2015)
Carterocephalus silvicola  15,765       99.0%     NC_024646.1  Kim et al. (2014)
Celaenorrhinus maculosa    15,282       n.a.      NC_022853.1  Wang, Hao & Zhao (2013)
Choaspes benjaminii        15,300       85.9%     NC_024647.1  Kim et al. (2014)
Ctenoptilum vasava         15,468       n.a.      NC_016704.1  Hao et al. (2012)
Daimio tethys              15,350       96.8%     NC_024648.1  Kim et al. (2014)
Erynnis montanus           15,530       99.8%     NC_021427.1  Wang et al. (2014)
Graphium timur             15,226       97.9%     NC_024098.1  Chen et al. (2014)
Hasora anura               15,290       n.a.      NC_027263.1  Wang et al. (2015)
Lobocla bifasciata         15,366       95.9%     NC_024649.1  Kim et al. (2014)
Ochlodes venata            15,622       78.5%     NC_018048.1  C Xu et al., 2012, unpublished data
Papilio glaucus            15,306       99.5%     NC_027252.1  Shen, Cong & Grishin (2015) and Cong et al. (2015)
Parnassius apollo          15,404       98.6%     NC_024727.1  Kim et al. (2009)
Phthonandria atrilineata   15,499       99.9%     NC_010522.1  Yang et al. (2009)
Polytremis jigongi         15,353       99.6%     NC_026990.1  Jiang et al. (2015)
Polytremis nascens         15,392       83.6%     NC_026228.1  Jiang et al. (2015)
Potanthus flavus           15,267       99.3%     NC_024650.1  Kim et al. (2014)

Notes.

Identity: the lowest sequence identity to independently sequenced mitochondrial DNA of the same species in the Non-redundant database identified by BLAST.

n.a.: there are no other mitochondrial sequences of the same species in GenBank for cross-validation.
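The screening of reference mitogenomes by their lowest BLAST identity can be reproduced from the values in Table 1 with a short Python sketch. The 95% threshold comes from the text; treating species without an independent sequence (n.a.) as retained is an assumption, made here because those species appear in the phylogenetic analysis. The variable names are hypothetical.

```python
# Lowest identity (%) to independently sequenced mitochondrial DNA of the
# same species, copied from Table 1; None stands for "n.a.".
TABLE1_IDENTITY = {
    "Ampittia dioscorides": 91.2,
    "Apocheima cinerarium": None,
    "Biston suppressaria": 99.7,
    "Carterocephalus silvicola": 99.0,
    "Celaenorrhinus maculosa": None,
    "Choaspes benjaminii": 85.9,
    "Ctenoptilum vasava": None,
    "Daimio tethys": 96.8,
    "Erynnis montanus": 99.8,
    "Graphium timur": 97.9,
    "Hasora anura": None,
    "Lobocla bifasciata": 95.9,
    "Ochlodes venata": 78.5,
    "Papilio glaucus": 99.5,
    "Parnassius apollo": 98.6,
    "Phthonandria atrilineata": 99.9,
    "Polytremis jigongi": 99.6,
    "Polytremis nascens": 83.6,
    "Potanthus flavus": 99.3,
}

MIN_IDENTITY = 95.0  # threshold stated in the text (>95%)

# Assumption: species that cannot be cross-validated (None) are retained.
excluded = sorted(
    species for species, identity in TABLE1_IDENTITY.items()
    if identity is not None and identity <= MIN_IDENTITY
)
retained = sorted(set(TABLE1_IDENTITY) - set(excluded))
```

Applied to Table 1, this filter excludes the same four skippers that the text lists as removed from downstream analyses.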

Protein sequences of the 13 protein-coding genes were aligned by MAFFT. We manually checked the alignments, corrected annotation errors based on consensus, and removed positions with long gaps together with their surrounding regions of uncertain alignment. The processed alignments were concatenated and analyzed with Bayesian Inference and Maximum Likelihood methods using Phylobayes-MPI v1.5a (Lartillot, Lepage & Blanquart, 2009) (model: CATGTR (Lartillot & Philippe, 2004)) and RAxML v8.1.17 (Stamatakis, 2014) (model: PROTGAMMAAUTO), respectively. The resulting phylogenetic trees were visualized in FigTree v1.4.2.
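Concatenating per-gene protein alignments and discarding gap-rich positions can be sketched in Python as below. In the study the gap-rich positions were removed manually, so the automatic rule used here (dropping any column whose gap fraction exceeds a threshold) is a stand-in chosen for illustration. The function names, the threshold, and the toy alignments are all hypothetical.

```python
def concatenate_alignments(alignments, taxa):
    """Join per-gene alignments (dicts mapping taxon -> aligned sequence)
    into one concatenated sequence per taxon."""
    return {taxon: "".join(aln[taxon] for aln in alignments) for taxon in taxa}

def drop_gappy_columns(alignment, max_gap_fraction):
    """Remove columns in which the fraction of gap characters ('-')
    exceeds max_gap_fraction."""
    taxa = list(alignment)
    length = len(alignment[taxa[0]])
    keep = [
        i for i in range(length)
        if sum(alignment[t][i] == "-" for t in taxa) / len(taxa) <= max_gap_fraction
    ]
    return {t: "".join(alignment[t][i] for i in keep) for t in taxa}

# Toy alignments (hypothetical) for two genes and three taxa.
gene1 = {"A": "MKL-", "B": "MKLF", "C": "MRL-"}
gene2 = {"A": "GGA", "B": "G-A", "C": "GGA"}

supermatrix = concatenate_alignments([gene1, gene2], taxa=["A", "B", "C"])
trimmed = drop_gappy_columns(supermatrix, max_gap_fraction=0.5)
```

In the toy data the column that is a gap in two of three taxa is removed, while the column with a single gap is kept.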

Results and Discussion

Annotation of the mitogenome

The complete mitogenome of Lerema accius is deposited in NCBI GenBank under accession number KT598278. The mitogenome is 15,338 bp long and retains the typical insect mitogenome gene set and gene order, including 13 PCGs (nd1-6, nd4l, cox1-3, atp8, atp6, and cytb), 22 tRNA genes (two each for serine and leucine and one for each of the other amino acids), two rRNA genes (rrnL and rrnS), and an A + T rich D-loop control region. The annotation of the mitogenome is illustrated in Fig. 1. The cox1 gene uses the start codon CGA, which is consistent with many other insect mitogenomes (Kim et al., 2009). All other genes start with the typical ATN. The genes cox1, cox2 and nd4 use an incomplete stop codon T (Ojala, Montoya & Attardi, 1981), and a complete TAA codon will likely be formed during mRNA maturation (Ojala, Montoya & Attardi, 1981; Boore, 1999).
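The start and stop codon categories mentioned above can be expressed as a small Python sketch. It classifies a coding sequence by its start codon (ATN or CGA) and by whether its stop codon is complete or an incomplete T, and shows how appended A residues would complete the stop to TAA. The function names and the toy sequences are hypothetical and are not taken from the Lerema accius annotation.

```python
def classify_start(cds):
    """Classify the start codon as 'ATN', 'CGA' or 'other'."""
    codon = cds[:3].upper()
    if codon.startswith("AT"):
        return "ATN"
    if codon == "CGA":
        return "CGA"
    return "other"

def classify_stop(cds):
    """Classify the end of a coding sequence as a complete stop codon,
    an incomplete stop codon T, or 'other'."""
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0 and cds[-3:] in ("TAA", "TAG"):
        return "complete"
    if remainder == 1 and cds.endswith("T"):
        return "incomplete T"
    return "other"

def complete_stop(cds):
    """Append the A residues needed to fill the last codon, mimicking the
    completion of an incomplete stop codon during mRNA maturation."""
    missing = (3 - len(cds) % 3) % 3
    return cds + "A" * missing

# Toy coding sequences (hypothetical).
toy_cga_gene = "CGAAAATTTT"    # CGA start, ends with a lone T
toy_atn_gene = "ATTGGATTATAA"  # ATN start, complete TAA stop
```

For the first toy sequence, complete_stop returns a sequence ending in TAA, which corresponds to the completed stop codon described in the text.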
Figure 1

Coverage and annotation of the Lerema accius mitogenome.

The same base positions are aligned between (A) and (B). (A) Coverage by sequencing reads at each base position. (B) Map of genes in the Lerema accius mitogenome. PCGs are colored in red, tRNA-coding genes are in blue, rrnL and rrnS are in green. Each gene is shown as an arrow indicating the transcription direction. The arrows on top of the black line correspond to genes coded on the majority strand, and those below show genes on the minority strand.

The lengths of the tRNA-coding genes range from 60 bp to 70 bp. Secondary structures predicted by MITOS suggest that all tRNAs adopt a typical cloverleaf structure except for trnS1(gcu) (Fig. 2). The dihydrouridine (DHU) arm of trnS1(gcu) does not form a stable stem-loop structure, a feature that is very common in butterfly mitogenomes (Lu et al., 2013; Kim et al., 2014). A 488 bp A + T rich region (A + T content: 94.7%) connects rrnS and trnM(cau). This region contains an “ATAGA” motif located 22 bp downstream from rrnS, followed by a 15 bp poly-T stretch; this combination is a gene regulation element commonly found in Lepidoptera (Lu et al., 2013; Salvato et al., 2008).
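The A + T content and the “ATAGA” plus poly-T element can be computed with a short Python sketch. The toy sequence is invented and much shorter than the real 488 bp region. The function names, the minimum poly-T length, and the requirement that the poly-T stretch directly follows the motif are assumptions for illustration.

```python
import re

def at_content(seq):
    """Fraction of A and T bases in a sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)

def find_motif_with_poly_t(seq, motif="ATAGA", min_t=10):
    """Return (motif start, poly-T length) for the first occurrence of the
    motif that is directly followed by at least min_t T's, or None."""
    pattern = re.compile(re.escape(motif) + "(T{%d,})" % min_t)
    match = pattern.search(seq.upper())
    if match is None:
        return None
    return match.start(), len(match.group(1))

# Toy control-region fragment (hypothetical).
toy_region = "AATTA" + "ATAGA" + "T" * 15 + "AATTAATCG"
toy_at = at_content(toy_region)
toy_hit = find_motif_with_poly_t(toy_region)
```

On the toy fragment the search reports the motif at position 5 (0-based) followed by a 15 bp poly-T run.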
Figure 2

Secondary structure of 22 tRNAs encoded by the Lerema accius mitogenome.

The tRNAs are labeled by the abbreviations of their corresponding amino acids.

We built a phylogenetic tree of the 10 skipper species with published mitogenomes, based on the concatenated alignment of the mitochondrial protein sequences. Three Papilionidae and three Geometridae mitogenomes were used as outgroups. The maximum likelihood program RAxML automatically selected MTZOA, a general mitochondrial amino acid substitution model, as the most appropriate model and placed Lerema accius among other grass skippers (subfamily Hesperiinae). A Bayesian analysis with the CATGTR model supported a tree with exactly the same topology. This topology is largely consistent with previously reported phylogenetic studies based on standard gene markers and morphology (Warren, Ogawa & Brower, 2008; Warren, Ogawa & Brower, 2009; Yuan et al., 2015). Notably, the subfamily Coeliadinae (represented by Hasora anura) is sister to all other Hesperiidae. The topology among the subfamilies Eudaminae (Lobocla bifasciata), Pyrginae (other branches shown in green in Fig. 3) and the remaining skippers is unresolved. The tribes Celaenorrhini (Celaenorrhinus maculosa) and Tagiadini (Daimio and Ctenoptilum) group together (in the absence of Pyrrhopygini). The subfamily Heteropterinae (represented by Carterocephalus silvicola) is sister to the grass skippers (Hesperiinae).
Figure 3

Phylogeny of skippers based on the concatenated alignment of the mitochondrial protein sequences.

(A) Consensus of phylogenetic trees by RAxML (MTZOA model) based on bootstrap samples of the alignment. (B) Consensus of phylogenetic trees sampled by Phylobayes-MPI (v1.5a) with CATGTR model.

Interestingly, based on the mitochondrial genome, the two Asian grass skippers (Potanthus from the tribe Taractrocerini and Polytremis from the tribe Baorini) are grouped together, and Lerema (from the tribe Moncini) is their sister. The sequences of two nuclear markers, EF1a and wingless, are available for these species in the database; however, they support different topologies at low confidence. While the maximum likelihood tree based on EF1a favors (bootstrap: 52%) the same topology as the mitogenome, the tree based on wingless groups Potanthus with Lerema with 63% bootstrap support and places Polytremis as their sister. The phylogeny of these tribes could become clearer when more sequences from more taxa become available.
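The comparison of the alternative three-taxon topologies can be made explicit with a small Python sketch. Rooted trees are written as nested tuples and compared by the sets of clades they contain. This representation and the function names are assumptions chosen for illustration; the trees in the study were inferred with RAxML and Phylobayes-MPI and inspected in FigTree.

```python
def clades(tree):
    """Return (leaves, clades) for a rooted tree given as nested tuples,
    where clades is the set of leaf sets of all internal nodes."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found

def same_topology(tree_a, tree_b):
    """True if two rooted trees contain exactly the same clades."""
    return clades(tree_a)[1] == clades(tree_b)[1]

# Topologies described in the text for the three grass skipper genera.
mitogenome_tree = (("Potanthus", "Polytremis"), "Lerema")
ef1a_tree = ("Lerema", ("Polytremis", "Potanthus"))
wingless_tree = (("Potanthus", "Lerema"), "Polytremis")

mito_vs_ef1a = same_topology(mitogenome_tree, ef1a_tree)
mito_vs_wingless = same_topology(mitogenome_tree, wingless_tree)
```

Because clades are compared as sets, the order in which sister groups are written does not matter; only the grouping itself does.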
References (10 of 38 shown)

1.  Characterization of complete mitochondrial genome of the skipper butterfly, Celaenorrhinus maculosus (Lepidoptera: Hesperiidae).

Authors:  Kai Wang; Jiasheng Hao; Huabin Zhao
Journal:  Mitochondrial DNA       Date:  2013-10-09

2.  Fast gapped-read alignment with Bowtie 2.

Authors:  Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2012-03-04       Impact factor: 28.547

3.  Complete mitochondrial DNA genome of Polytremis nascens (Lepidoptera: Hesperiidae).

Authors:  Weibin Jiang; Jianqing Zhu; Qichang Yang; Huidong Zhao; Minghan Chen; Haiyan He; Weidong Yu
Journal:  Mitochondrial DNA A DNA Mapp Seq Anal       Date:  2015-02-18       Impact factor: 1.514

4.  The complete mitochondrial genome of the mountainous duskywing, Erynnis montanus (Lepidoptera: Hesperiidae): a new gene arrangement in Lepidoptera.

Authors:  Ah Rha Wang; Heon Cheon Jeong; Yeon Soo Han; Iksoo Kim
Journal:  Mitochondrial DNA       Date:  2013-04-16

5.  MITOS: improved de novo metazoan mitochondrial genome annotation.

Authors:  Matthias Bernt; Alexander Donath; Frank Jühling; Fabian Externbrink; Catherine Florentz; Guido Fritzsch; Joern Pütz; Martin Middendorf; Peter F Stadler
Journal:  Mol Phylogenet Evol       Date:  2012-09-07       Impact factor: 4.286

6.  doublesex is a mimicry supergene.

Authors:  K Kunte; W Zhang; A Tenger-Trolander; D H Palmer; A Martin; R D Reed; S P Mullen; M R Kronforst
Journal:  Nature       Date:  2014-03-05       Impact factor: 49.962

7.  Quake: quality-aware detection and correction of sequencing errors.

Authors:  David R Kelley; Michael C Schatz; Steven L Salzberg
Journal:  Genome Biol       Date:  2010-11-29       Impact factor: 13.583

8.  Phylogenetic relationships of subfamilies in the family Hesperiidae (Lepidoptera: Hesperioidea) from China.

Authors:  Xiangqun Yuan; Ke Gao; Feng Yuan; Ping Wang; Yalin Zhang
Journal:  Sci Rep       Date:  2015-06-10       Impact factor: 4.379

9.  The complete mitochondrial genome of the bag-shelter moth Ochrogaster lunifer (Lepidoptera, Notodontidae).

Authors:  Paola Salvato; Mauro Simonato; Andrea Battisti; Enrico Negrisolo
Journal:  BMC Genomics       Date:  2008-07-15       Impact factor: 3.969

10.  Molecular basis of wing coloration in a Batesian mimic butterfly, Papilio polytes.

Authors:  Hideki Nishikawa; Masatoshi Iga; Junichi Yamaguchi; Kazuki Saito; Hiroshi Kataoka; Yutaka Suzuki; Sumio Sugano; Haruhiko Fujiwara
Journal:  Sci Rep       Date:  2013-11-11       Impact factor: 4.379
