
Complete mitogenome data from a summer population specimen of the urticating pine defoliator Thaumetopoea pityocampa (Lepidoptera, Notodontidae, Thaumetopoeinae, Thaumetopoea).

Bernhard Gschloessl, Carole Kerdelhué.

Abstract

We present a de novo mitogenome assembly obtained from specimens sampled in the so-called summer population (SP) of Thaumetopoea pityocampa (Denis and Schiffermüller, 1775) in Portugal. Whereas larval development typically occurs in winter in this species, the larvae of this unique population develop during summer. The sequencing data used were obtained from genomic libraries originally generated to assemble the nuclear genome of T. pityocampa [1]. We also provide a complete annotation and a phylogenetic representation that positions the Portuguese summer population of T. pityocampa and a typical Italian individual of the same species within the Notodontidae family and among more distant Noctuoidea species. These data represent a valuable new resource for an expanding and urticating insect pest.
© 2022 The Authors.

Keywords:  Annotation; Assembly; Lepidoptera; Mitogenome; Notodontidae; Phylogeny; Thaumetopoeinae

Year: 2022; PMID: 35789903; PMCID: PMC9249620; DOI: 10.1016/j.dib.2022.108376

Source DB: PubMed; Journal: Data Brief; ISSN: 2352-3409


Specifications Table

Type of data:
FASTA: mitochondrial genome assembly
TABLE: gene annotations, base composition, Lepidoptera species used for phylogenetic analysis
FIGURE: mitogenomic circular map, AT-rich region sequence, phylogenetic tree

Value of the Data

These data represent the mitogenome of a phenologically shifted specimen of Thaumetopoea pityocampa collected in Portugal. They can be used for phylogenetic studies of Lepidoptera, in particular to gain deeper insight into the diversification dynamics of Notodontidae species. The resource will also help to study the evolution of the genus Thaumetopoea and of the T. pityocampa / T. wilkinsoni species complex.

Data Description

The T. pityocampa summer population (“TpitSP”) mitogenome presented here has a size of 15,729 bp and an A + T content of 77.8%. All 38 genetic regions reported for arthropod mitogenomes [2] were identified (Supplementary GFF annotation file): 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes (12S rRNA and 16S rRNA), and one AT-rich control region. The gene order was identical to that of the Italian T. pityocampa [3] and other Notodontidae [4] mitogenomes. The start and stop codons of the protein-coding genes also corresponded to those observed in other Noctuoidea. In total, 22 intergenic spacer regions (see Table 1: positive intergenic nucleotide values) were identified, ranging from 1 to 96 bp, with an average size of 20 bp and a total length of 437 bp. The longest intergenic spacer was located between the ND4 and ND4L genes. Furthermore, three gene pairs overlapped by 1 to 8 bp (see Table 1: negative intergenic nucleotide values). All annotation details are given in Table 1, and Fig. 1 depicts the genetic regions of the TpitSP mitogenome as a circular map. Table 2 shows the base composition and the AT and GC skews for each annotation type. The typical clover-leaf secondary structure was observed for all tRNAs except trnS1, in which the dihydrouridine (DHU) arm was replaced by an unstable loop (Supplementary Fig. 1). Fig. 2 shows characteristics of the AT-rich region. Finally, Fig. 3 and Supplementary Fig. 2 show phylogenies built on the mitogenomes of SP, 30 other Noctuoidea and two Geometridae species representing the outgroup (Table 3).
Table 1

Gene features of the SP T. pityocampa mitogenome.

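| Gene (anticodon) | Start | End | Strand | Size | Intergenic nucleotides | Start/stop codon |
|---|---|---|---|---|---|---|
| trnM(cat) | 1 | 68 | + | 68 | 0 | |
| trnI(gat) | 69 | 135 | + | 67 | 2 | |
| trnQ(ttg) | 138 | 206 | - | 69 | 63 | |
| ND2 | 270 | 1,283 | + | 1,014 | 2 | ATT/TAA |
| trnW(tca) | 1,286 | 1,351 | + | 66 | -8 | |
| trnC(gca) | 1,344 | 1,408 | - | 65 | 10 | |
| trnY(gta) | 1,419 | 1,484 | - | 66 | 4 | |
| COX1 | 1,489 | 3,021 | + | 1,533 | 2 | CGA/TAA |
| trnL2(taa) | 3,024 | 3,093 | + | 70 | 2 | |
| COX2 | 3,096 | 3,774 | + | 679 | 0 | ATG/ATT |
| trnK(ctt) | 3,775 | 3,845 | + | 71 | 10 | |
| trnD(gtc) | 3,856 | 3,931 | + | 76 | 0 | |
| ATP8 | 3,932 | 4,102 | + | 171 | -7 | ATT/TAA |
| ATP6 | 4,096 | 4,776 | + | 681 | 28 | ATG/TAA |
| COX3 | 4,805 | 5,593 | + | 789 | 7 | ATG/TAA |
| trnG(tcc) | 5,601 | 5,675 | + | 75 | 0 | |
| ND3 | 5,676 | 6,029 | + | 354 | 64 | ATT/TAA |
| trnA(tgc) | 6,094 | 6,162 | + | 69 | 1 | |
| trnR(tcg) | 6,164 | 6,230 | + | 67 | 13 | |
| trnN(gtt) | 6,244 | 6,312 | + | 69 | 24 | |
| trnS1(gct) | 6,337 | 6,402 | + | 66 | 62 | |
| trnE(ttc) | 6,465 | 6,534 | + | 70 | 6 | |
| trnF(gaa) | 6,541 | 6,612 | - | 72 | 3 | |
| ND5 | 6,616 | 8,364 | - | 1,749 | 0 | ATT/TAA |
| trnH(gtg) | 8,365 | 8,430 | - | 66 | 0 | |
| ND4 | 8,431 | 9,769 | - | 1,339 | 96 | ATG/TTT |
| ND4L | 9,866 | 10,156 | - | 291 | 5 | ATG/TAA |
| trnT(tgt) | 10,162 | 10,225 | + | 64 | 0 | |
| trnP(tgg) | 10,226 | 10,290 | - | 65 | 8 | |
| ND6 | 10,299 | 10,829 | + | 531 | -1 | ATG/TAA |
| CYTB | 10,829 | 11,983 | + | 1,155 | 3 | ATG/TAA |
| trnS2(tga) | 11,987 | 12,053 | + | 67 | 22 | |
| ND1 | 12,076 | 13,011 | - | 936 | 0 | GTG/TAA |
| trnL1(tag) | 13,012 | 13,082 | - | 71 | 0 | |
| 16S rRNA | 13,083 | 14,463 | - | 1,381 | 0 | |
| trnV(tac) | 14,464 | 14,531 | - | 68 | 0 | |
| 12S rRNA | 14,532 | 15,309 | - | 778 | 0 | |
| AT-rich region | 15,310 | 15,729 | + | 420 | 0 | |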

The gene names, the coordinates, the orientation, the size, the gene overlaps/intergenic spacers (column 6) and the start/stop codons of the SP mitogenome are listed.
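
The intergenic nucleotide values in Table 1 appear to follow directly from the coordinates: the start of the next feature minus the end of the current one, minus one, with negative values indicating overlaps. The following minimal Python sketch recomputes them for the first six rows of the table; the function name `intergenic_gaps` and the tuple layout are illustrative assumptions, not part of the published pipeline.

```python
# (gene, start, end) for the first six rows of Table 1, in genome order.
features = [
    ("trnM", 1, 68),
    ("trnI", 69, 135),
    ("trnQ", 138, 206),
    ("ND2", 270, 1283),
    ("trnW", 1286, 1351),
    ("trnC", 1344, 1408),
]


def intergenic_gaps(rows):
    """Return (gene, gap) pairs for consecutive features.

    gap = next_start - end - 1; a positive value is a spacer,
    zero means the features are adjacent, a negative value is an overlap.
    """
    gaps = []
    for (name, _start, end), (_next_name, next_start, _next_end) in zip(rows, rows[1:]):
        gaps.append((name, next_start - end - 1))
    return gaps


gaps = intergenic_gaps(features)
for name, gap in gaps:
    print(name, gap)
```

For these rows the computed gaps are 0, 2, 63, 2 and -8, matching the values listed for trnM, trnI, trnQ, ND2 and trnW.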

Fig. 1

Map of the T. pityocampa SP mitogenome generated with OGDRAW. Annotated genes are highlighted.

Table 2

Base composition and skewness indices for each gene region of the SP T. pityocampa mitogenome.

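| Type | Size [bp] | A% | G% | T% | C% | A+T% | AT skewness | GC skewness |
|---|---|---|---|---|---|---|---|---|
| Entire mitogenome | 15,729 | 40.6 | 7.6 | 37.3 | 14.6 | 77.8 | 0.042 | -0.318 |
| Protein-coding genes | 11,222 | 32.6 | 12.1 | 43.0 | 12.4 | 75.5 | -0.138 | -0.009 |
| tRNA | 1,507 | 41.4 | 11.3 | 39.3 | 7.9 | 80.8 | 0.025 | 0.179 |
| rRNA | 2,159 | 39.2 | 13.1 | 43.1 | 4.6 | 82.3 | -0.048 | 0.482 |
| AT-rich region | 420 | 43.6 | 2.9 | 50.0 | 3.6 | 93.6 | -0.069 | -0.111 |

Fig. 2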

Characteristics of the AT-rich region of the T. pityocampa SP mitogenome.

Fig. 3

Phylogenetic tree built on concatenated alignments of the nucleotide sequences of the 13 protein-coding and both rRNA genes applying a Maximum Likelihood approach with PHYML. For both T. pityocampa populations the countries are given.

Table 3

Lepidoptera mitogenomes used to build the phylogenetic trees. For both T. pityocampa populations the countries are given.

| Superfamily | Family | Subfamily | Species | GenBank no. |
|---|---|---|---|---|
| Geometroidea | Geometridae | Ennominae | Biston panterinaria | NC_020004 |
| Geometroidea | Geometridae | Ennominae | Phthonandria atrilineata | NC_010522 |
| Noctuoidea | Erebidae | Aganainae | Asota plana lacteata | KJ173908 |
| Noctuoidea | Erebidae | Arctiinae | Amata formosae | NC_021416 |
| Noctuoidea | Erebidae | Arctiinae | Callimorpha dominula | NC_027094 |
| Noctuoidea | Erebidae | Arctiinae | Lemyra melli | NC_026692 |
| Noctuoidea | Erebidae | Arctiinae | Vamuna virilis | NC_026844 |
| Noctuoidea | Erebidae | Erebinae | Grammodes geometrica | NC_039810 |
| Noctuoidea | Erebidae | Herminiinae | Hydrillodes lentalis | NC_041106 |
| Noctuoidea | Erebidae | Hypeninae | Paragabara curvicornuta | NC_034638 |
| Noctuoidea | Erebidae | Lymantriinae | Euproctis pseudoconspersa | NC_027145 |
| Noctuoidea | Erebidae | Lymantriinae | Gynaephora aureata | NC_029162 |
| Noctuoidea | Erebidae | Lymantriinae | Lachana alpherakii | NC_029172 |
| Noctuoidea | Erebidae | Lymantriinae | Lymantria dispar | NC_012893 |
| Noctuoidea | Euteliidae | Euteliinae | Eutelia adulatricoides | NC_026840 |
| Noctuoidea | Noctuidae | Heliothinae | Helicoverpa armigera | NC_014668 |
| Noctuoidea | Noctuidae | Noctuinae | Agrotis ipsilon | NC_022185 |
| Noctuoidea | Noctuidae | Noctuinae | Mythimna separata | NC_023118 |
| Noctuoidea | Noctuidae | Noctuinae | Sesamia inferens | NC_015835 |
| Noctuoidea | Noctuidae | Noctuinae | Spodoptera frugiperda | NC_027836 |
| Noctuoidea | Noctuidae | Noctuinae | Striacosta albicosta | NC_025774 |
| Noctuoidea | Noctuidae | Plusiinae | Ctenoplusia agnata | NC_021410 |
| Noctuoidea | Noctuidae | Plusiinae | Diachrysia nadeja | NC_053855 |
| Noctuoidea | Noctuidae | Plusiinae | Trichoplusia ni | NC_045936 |
| Noctuoidea | Nolidae | Chloephorinae | Gabala argentata | NC_026842 |
| Noctuoidea | Nolidae | Chloephorinae | Sinna extrema | NC_045120 |
| Noctuoidea | Nolidae | Risobinae | Risoba prominens | NC_026841 |
| Noctuoidea | Notodontidae | Phalerinae | Phalera flavescens | NC_016067 |
| Noctuoidea | Notodontidae | Pygaerinae | Clostera anachoreta | NC_034740 |
| Noctuoidea | Notodontidae | Pygaerinae | Clostera anastomosis | NC_041140 |
| Noctuoidea | Notodontidae | Thaumetopoeinae | Ochrogaster lunifer | NC_011128 |
| Noctuoidea | Notodontidae | Thaumetopoeinae | Thaumetopoea pityocampa (Italy) | NC_053256 |
| Noctuoidea | Notodontidae | Thaumetopoeinae | Thaumetopoea pityocampa (Portugal) | OM478592 |

Experimental Design, Materials and Methods

DNA sequence libraries

To generate the present SP mitogenome assembly, we used two whole-genome shotgun paired-end sequence libraries originally built to produce the reference nuclear genome of T. pityocampa SP [1]. Each library was built from one SP male sampled between 2010 and 2012 in the Mata Nacional de Leiria in Portugal (39°47’N, 8°58’W). The libraries had insert sizes of 300 bp and 600 bp, respectively, and were sequenced on an Illumina HiSeq2000 platform at the Edinburgh Genomics sequencing facility in Scotland, UK. The raw whole-genome shotgun sequence reads were trimmed and cleaned with Trimmomatic v0.22 [5] (parameters: window size of 20 bp, minimum PHRED quality of 20, minimum length of 70 bp), and subsequently assembled by Eurofins MWG Operon using their own bioinformatics pipeline (CONVEY, http://www.conveycomputer.com), which implements multiple Velvet assemblies.

Mitogenome assembly and feature analyses

The SP mitogenome was generated in two successive steps. First, as described in [1], the nuclear genome assembly contained one scaffold of 15,717 bp, which was identified as a partial mitochondrial genome based on blastn searches against the NCBI nucleotide database. Second, MITObim v1.9 [6] was applied to extend and refine this mitogenome backbone using the cleaned (--clean) Illumina paired-end (--pair) reads with a k-mer size of 41 bp (--kbait 41), a maximum of 5 mismatches per mapped read (--mismatch 5) and a minimum average contig coverage of 500 reads per base (--min_cov 500). The resulting SP mitogenome was then globally aligned with MAFFT v7.471 [7] (executable mafft-ginsi, with a maximum of 1,000 refinement iterations (--maxiterate 1000)) to five other Notodontidae mitogenomes: Thaumetopoea pityocampa (Italian population, GenBank no. NC_053256), Ochrogaster lunifer (GenBank no. NC_011128), Clostera anastomosis (GenBank no. NC_041140), Clostera anachoreta (GenBank no. NC_034740) and Phalera flavescens (GenBank no. NC_016067). This alignment made it possible to identify overlapping repeat regions at both sequence termini, which were considered assembly artefacts (linear vs. circular structure) and were manually removed. This resulted in a mitogenome size of 15,729 bp. The genetic regions were annotated with the MITOS web server v2 [8], using the reference set ‘RefSeq 89 Metazoa’ and the genetic code ‘5’ for invertebrates. The MITOS gene annotations were then manually curated by aligning the predicted genes, via the NCBI blastn web service [9] (default parameters, version of January 2022), to the gene annotations of the other Notodontidae mitogenomes listed above. The web tool OGDRAW v1.3.1 [10] was run to create the circular map of the SP T. pityocampa mitogenome with its annotations. We applied the wordcount program of the EMBOSS toolkit v6.6.0.0 [11] to calculate the base content of each genetic region type.
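
When a circular molecule is assembled as a linear sequence, the same stretch can appear at both termini. In the work described above this redundancy was identified through alignment to related mitogenomes and removed manually. The Python sketch below illustrates the general idea with a simpler, different technique: an exact suffix/prefix match on a toy sequence. The function names, the `min_len` threshold and the toy sequence are assumptions made for illustration, and real terminal repeats may not match exactly.

```python
def terminal_overlap(seq, min_len=10):
    """Length of the longest prefix of seq that is repeated exactly as its
    suffix (at least min_len bases, at most half the sequence), else 0."""
    for k in range(len(seq) // 2, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def circularize(seq, min_len=10):
    """Drop the redundant terminal copy so each base is represented once."""
    k = terminal_overlap(seq, min_len)
    return seq[:-k] if k else seq


# Toy example: a 25 bp "circular genome" linearized with 12 redundant bases.
unit = "ACGTTGCAAGGCTTAACCGGTTAGC"
linear = unit + unit[:12]

print(terminal_overlap(linear))
print(circularize(linear) == unit)
```

A scan from the longest candidate downwards is used so that the largest exact overlap is trimmed; sequencing errors or indels within the repeat would require an alignment-based comparison instead.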
The compositional asymmetry of each annotation type of the SP mitogenome was described by the AT and GC skewness indices, calculated as follows, where each letter denotes the count of the corresponding base: AT skew = (A - T) / (A + T) and GC skew = (G - C) / (G + C).
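
The skew formulas can be sketched in a few lines of Python. The authors report using EMBOSS wordcount for base counts; the snippet below is only an illustrative stand-in, and the function names `skew` and `base_skews` as well as the toy sequence are assumptions.

```python
from collections import Counter


def skew(x, y):
    """Generic skew index (x - y) / (x + y)."""
    return (x - y) / (x + y)


def base_skews(seq):
    """Return (AT skew, GC skew, A+T percentage) for a DNA sequence.

    Only A, C, G and T are counted; a sequence lacking A/T or G/C
    would raise ZeroDivisionError.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_content = 100 * (a + t) / (a + t + g + c)
    return skew(a, t), skew(g, c), at_content


# Toy sequence: A=4, T=3, G=1, C=2.
at_skew, gc_skew, at_content = base_skews("AAAATTTGCC")
print(at_skew, gc_skew, at_content)

# The same ratio applied to the Table 2 percentages for the entire mitogenome.
print(round(skew(40.6, 37.3), 3))  # 0.042
```

Because the skew is a ratio, it gives the same result whether counts or percentages are supplied, which is why the rounded percentages in Table 2 reproduce the reported AT skew of the entire mitogenome.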

Phylogenetic analyses

Phylogenetic trees were built on 33 Lepidoptera mitogenomes, including 30 Noctuoidea species (Table 3). Two representatives of the Geometroidea were used as outgroup to root the trees, following the conclusions of Wahlberg et al. [12]. We obtained the nucleotide sequences of the 13 PCGs and 2 rRNA genes from the NCBI ‘nucleotide’ database (https://www.ncbi.nlm.nih.gov/nuccore, July 2021). The superfamily Noctuoidea was chosen for taxon sampling, encompassing five species belonging to the Notodontidae (including T. pityocampa). For T. pityocampa, we included the mitogenomes of the Portuguese TpitSP (provided here) and of a typical Italian population [3]. The nucleotide sequences of all species were grouped into 15 gene-specific sequence sets, corresponding to the 13 PCGs and 2 rRNA genes. Global sequence alignment of each gene set using the Needleman-Wunsch algorithm was conducted with MAFFT v7.471 (executable mafft-ginsi for PCGs and mafft-qinsi for rRNA sequences), applying a maximum of 1,000 refinement iterations (--maxiterate 1000). Finally, in-house Perl and Python scripts were run to concatenate, for each species, all gene-specific alignments in the same gene order. A local installation of MrBayes v3.2.7a [13] and the PHYML web server v3.0 [14] were used to calculate phylogenetic trees. In the MrBayes analysis, a distinct partition was configured for each gene, each with the following parameters: GTR substitution model (nst = 6) with gamma-distributed rate variation across sites and a proportion of invariable sites (rates = invgamma; GTR + I + Γ). The standard nucleotide model (nucmodel = 4by4) was chosen as substitution model, with priors left at their default settings. We configured MrBayes to use 100,000 samples (ngen = 1,000,000, samplefreq = 10) to generate phylogenetic trees.
Additional settings were: printfreq = 100, diagnfreq = 1,000, nchains = 4, savebrlens = yes, starttree = random, startparams = reset, outgroup bipan (corresponding to Biston panterinaria), sump burnin = 25,000, sumt burnin = 25,000. For the PHYML analysis, the Smart Model Selection [15] parameter was chosen with the Akaike Information Criterion, and BIONJ was set as the tree searching method. Furthermore, 1,000 bootstraps were applied. Finally, the program FigTree v1.4.4 [16] was run to draw the phylogenetic tree.
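
The concatenation step described above, in which gene-specific alignments are joined per species in a fixed gene order, can be sketched as follows. This is not the authors' in-house script: the function name, the data layout, the taxon label `TpitIT` and the toy sequences are all assumptions made for illustration. The sketch also records the 1-based column range of each gene, which is the kind of information a per-gene partition scheme needs.

```python
def concatenate_alignments(gene_order, alignments):
    """Concatenate per-gene alignments into one sequence per taxon.

    alignments maps gene name -> {taxon: aligned sequence}.
    Returns (matrix, partitions): matrix maps taxon -> concatenated
    sequence, partitions lists 1-based (gene, start, end) column ranges.
    """
    taxa = sorted(alignments[gene_order[0]])
    parts = {taxon: [] for taxon in taxa}
    partitions = []
    position = 1
    for gene in gene_order:
        block = alignments[gene]
        if sorted(block) != taxa:
            raise ValueError(f"{gene}: taxon set differs from {gene_order[0]}")
        lengths = {len(seq) for seq in block.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        length = lengths.pop()
        partitions.append((gene, position, position + length - 1))
        position += length
        for taxon in taxa:
            parts[taxon].append(block[taxon])
    matrix = {taxon: "".join(chunks) for taxon, chunks in parts.items()}
    return matrix, partitions


# Toy alignments: made-up sequences, hypothetical taxon labels.
toy = {
    "COX1": {"TpitSP": "ATGAAA", "TpitIT": "ATGAAG", "bipan": "ATG-AA"},
    "ND1": {"TpitSP": "GTGTTT", "TpitIT": "GTGTTC", "bipan": "GTGTT-"},
    "rrnL": {"TpitSP": "AATT", "TpitIT": "AATT", "bipan": "A-TT"},
}
matrix, partitions = concatenate_alignments(["COX1", "ND1", "rrnL"], toy)
print(matrix["TpitSP"])
print(partitions)
```

Checking that every gene block contains the same taxa and a single alignment length guards against silently shifted columns in the concatenated matrix.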

CRediT Author Statement

Bernhard Gschloessl: Conceptualization, Methodology, Formal analysis, Validation, Writing – review & editing; Carole Kerdelhué: Project administration, Funding acquisition, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Subject: Omics: Genomics

Specific subject area: Lepidoptera, Notodontidae, Mitogenomics

Type of data:
FASTA: mitochondrial genome assembly
TABLE: gene annotations, base composition, Lepidoptera species used for phylogenetic analysis
FIGURE: mitogenomic circular map, AT-rich region sequence, phylogenetic tree

How the data were acquired: Two whole-genome Illumina shotgun libraries were sequenced on an Illumina HiSeq2000 platform using the paired-end protocol (2 × 100 base pairs (bp)).

Data format: Analysed: de novo assembled and annotated mitogenome

Description of data collection: MITObim v1.9 assembled genomic libraries generated from male individuals. Genomic regions were annotated with MITOS v2 and represented as circular mitogenome map with OGDRAW. Multiple sequence alignments were generated with MAFFT v7.471 and subsequently concatenated to build phylogenetic trees with PHYML v3.0 and MrBayes v3.2.7a. Trees were visualized with FigTree v1.4.4.

Data source location:
Institution: INRAE
City/Town/Region: Mata Nacional de Leiria
Country: Portugal
Latitude and longitude (and GPS coordinates, if possible) for collected samples/data: 39°47’N, 8°58’W

Data accessibility:
Repository name: NCBI BioProject
Data identification number: PRJNA344465
Direct URL to data: http://www.ncbi.nlm.nih.gov/bioproject/344465
Repository name: NCBI BioSample
Data identification number: SAMN24475646
Direct URL to data: https://www.ncbi.nlm.nih.gov/biosample/24475646
Repository name: NCBI Genbank
Data identification number: OM478592
Direct URL to data: https://www.ncbi.nlm.nih.gov/nuccore/OM478592

References (10 of 15 shown, numbered as cited in the text)

[1] B Gschloessl; F Dorkeld; H Berges; G Beydon; O Bouchez; M Branco; A Bretaudeau; C Burban; E Dubois; P Gauthier; E Lhuillier; J Nichols; S Nidelet; S Rocha; L Sauné; R Streiff; M Gautier; C Kerdelhué. Draft genome and reference transcriptomic resources for the urticating pine defoliator Thaumetopoea pityocampa (Lepidoptera: Notodontidae). Mol Ecol Resour. 2018-02-12.

[2] J L Boore. Animal mitochondrial genomes. Nucleic Acids Res. 1999-04-15.

[3] Kai Wu; Jinge Yang; Yuyang Ni; Qiuning Liu. Identification and analysis of the complete mitochondrial genome of Thaumetopoea pityocampa (Lepidoptera: Notodontidae). Mitochondrial DNA B Resour. 2019-10-21.

[4] Xiao-Yu Zhu; Zhao-Zhe Xin; Yu Liu; Ying Wang; Yan Huang; Zhi-Hui Yang; Xiao-Hua Chu; Dai-Zhen Zhang; Hua-Bin Zhang; Chun-Lin Zhou; Jia-Lian Wang; Bo-Ping Tang; Qiu-Ning Liu. The complete mitochondrial genome of Clostera anastomosis (Lepidoptera: Notodontidae) and implication for the phylogenetic relationships of Noctuoidea species. Int J Biol Macromol. 2018-07-04.

[5] Anthony M Bolger; Marc Lohse; Bjoern Usadel. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014-04-01.

[7] Kazutaka Katoh; Daron M Standley. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013-01-16.

[8] Matthias Bernt; Alexander Donath; Frank Jühling; Fabian Externbrink; Catherine Florentz; Guido Fritzsch; Joern Pütz; Martin Middendorf; Peter F Stadler. MITOS: improved de novo metazoan mitochondrial genome annotation. Mol Phylogenet Evol. 2012-09-07.

[9] Mark Johnson; Irena Zaretskaya; Yan Raytselis; Yuri Merezhuk; Scott McGinnis; Thomas L Madden. NCBI BLAST: a better web interface. Nucleic Acids Res. 2008-04-24.

[11] P Rice; I Longden; A Bleasby. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 2000-06.

[12] Niklas Wahlberg; Christopher W Wheat; Carlos Peña. Timing and patterns in the taxonomic diversification of Lepidoptera (butterflies and moths). PLoS One. 2013-11-25.
