Marylin Miga1, Puteri Nur Syahzanani Jahari1, Chan Vei Siang2, Kamarul Rahim Kamarudin3, Mohd Shahir Shamsir3, Lili Tokiman4, Sivachandran Parimannan5,6, Heera Rajandas5,6, Farhan Mohamed2, Faezah Mohd Salleh1,5.
Abstract
Here, we present the complete mitochondrial genome of Pachliopta aristolochiae, a Common Rose butterfly from Malaysia. The sequence was generated using the Illumina NovaSeq 6000 sequencing platform. The mitogenome is 15,235 bp long and consists of 13 protein-coding genes, 22 transfer RNAs, two ribosomal RNAs, and two D-loop regions. The overall base composition is A (39.3%), T (42.3%), C (11.0%) and G (7.3%), giving an A+T content of 81.6%. The first three tRNAs are arranged as trnM-trnI-trnQ, which differs from the ancestral insect gene order trnI-trnQ-trnM. Phylogenetic analysis revealed that the sequenced Pachliopta aristolochiae is closely related to Losaria neptunus (NC 037868), with high support in both the ML and BI analyses. The data presented in this work provide a resource for researchers studying the phylogenetic relationships of Lepidoptera and the diversification of Pachliopta species. In addition, because this is a bioindicator species, the data can be used to assess environmental changes in terrestrial and aquatic ecosystems via environmental DNA approaches. The mitogenome of Pachliopta aristolochiae is available in GenBank under the accession number MZ781228.
Keywords: Lepidoptera; Malaysia; Mitogenome; Pachliopta aristolochiae; Papilionidae
Year: 2021 PMID: 35141362 PMCID: PMC8813591 DOI: 10.1016/j.dib.2021.107740
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Fig. 1. Mitogenome map of Pachliopta aristolochiae generated using OGDRAW [3]. Genes located on the heavy strand are shown on the outer side of the circle, and genes located on the light strand are shown on the inner side. The arrows indicate the direction of gene transcription. CR represents the control region (D-loop).
Sequencing data of Pachliopta aristolochiae mitogenome.
| Parameter | Pachliopta aristolochiae |
|---|---|
| Raw reads | 10,102,746 |
| Trimmed reads | 10,102,675 |
| Ave. read length | 149.5 |
| Mapped reads | 17,890 |
| Fraction of reads mapped | 0.002 |
| Depth of coverage (X) | 175.72 |
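
The figures in this table can be cross-checked with simple arithmetic. The sketch below is not the authors' pipeline output; it assumes that every mapped read contributes its full average length, so it only approximates the reported depth. On these numbers the mapped share comes to about 0.0018 (roughly 0.18% of trimmed reads), which rounds to the tabulated 0.002 when expressed as a fraction, and the estimated depth lands close to the reported 175.72X.

```python
# Values copied from the sequencing summary table.
trimmed_reads = 10_102_675
mapped_reads = 17_890
avg_read_length = 149.5      # bp
mitogenome_length = 15_235   # bp

# Share of trimmed reads that mapped to the mitogenome.
mapped_fraction = mapped_reads / trimmed_reads

# Rough mean depth: total mapped bases divided by the reference length.
# Assumption: each mapped read contributes its full average length.
approx_depth = mapped_reads * avg_read_length / mitogenome_length

print(f"mapped fraction: {mapped_fraction:.4f} ({mapped_fraction:.2%})")
print(f"approximate depth: {approx_depth:.1f}x")
```

The small gap between this estimate and the reported depth is expected, since a mapping pipeline counts aligned bases rather than read counts times an average length.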
Gene features of Pachliopta aristolochiae mitogenome.
| Gene (anticodon) | Start position | Stop position | Direction | Size (bp) | Start/Stop codon |
|---|---|---|---|---|---|
| trnM(cat) | 1 | 67 | F | 67 | |
| trnI(gat) | 67 | 130 | F | 64 | |
| trnQ(ttg) | 128 | 196 | R | 69 | |
| NAD2 | 231 | 1244 | F | 1014 | ATT/TAA |
| trnW(tca) | 1243 | 1307 | F | 65 | |
| trnC(gca) | 1300 | 1365 | R | 66 | |
| trnY(gta) | 1368 | 1434 | R | 67 | |
| COX1 | 1437 | 2967 | F | 1531 | CGA/TAA |
| trnL2(taa) | 2968 | 3034 | F | 67 | |
| COX2 | 3035 | 3716 | F | 682 | ATG/T |
| trnK(ctt) | 3717 | 3787 | F | 71 | |
| trnD(gtc) | 3787 | 3853 | F | 67 | |
| ATP8 | 3854 | 4021 | F | 168 | ATT/TAA |
| ATP6 | 4015 | 4692 | F | 678 | ATG/TAA |
| COX3 | 4692 | 5477 | F | 786 | ATG/TAA |
| trnG(tcc) | 5481 | 5546 | F | 66 | |
| NAD3 | 5547 | 5900 | F | 354 | ATA/TAG |
| trnA(tgc) | 5899 | 5963 | F | 65 | |
| trnR(tcg) | 5963 | 6024 | F | 62 | |
| trnN(gtt) | 6025 | 6089 | F | 65 | |
| trnS1(gct) | 6089 | 6148 | F | 60 | |
| D-loop | 6148 | 6192 | F | 45 | |
| trnE(ttc) | 6178 | 6246 | F | 69 | |
| trnF(gaa) | 6265 | 6330 | R | 66 | |
| NAD5 | 6333 | 8048 | R | 1716 | ATT/TAA |
| trnH(gtg) | 8067 | 8133 | R | 67 | |
| NAD4 | 8137 | 9472 | R | 1336 | ATG/T |
| NAD4l | 9474 | 9764 | R | 291 | ATG/TAA |
| trnT(tgt) | 9767 | 9831 | F | 65 | |
| trnP(tgg) | 9832 | 9896 | R | 65 | |
| NAD6 | 9899 | 10432 | F | 534 | ATT/TAA |
| CYTB | 10432 | 11580 | F | 1149 | ATG/TAA |
| trnS2(tga) | 11593 | 11657 | F | 65 | |
| NAD1 | 11674 | 12612 | R | 939 | ATG/TAA |
| trnL1(tag) | 12613 | 12683 | R | 71 | |
| 16S rRNA | 12659 | 13963 | R | 1280 | |
| trnV(tac) | 14021 | 14083 | R | 63 | |
| 12S rRNA | 14084 | 14802 | R | 719 | |
| D-loop | 14816 | 15235 | F | 420 | |
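
The sizes in this table follow from 1-based inclusive coordinates (size = stop - start + 1), and adjacent features may overlap or leave an intergenic gap. The following is a minimal sketch of that consistency check on a handful of rows copied from the table; the helper names `span`, `size_mismatches` and `adjacent_overlap` are hypothetical and not part of any annotation tool. On the rows as printed, the check flags 16S rRNA, whose coordinates span 1,305 bp against a listed size of 1,280 bp; the table alone does not show which of the two values is misprinted.

```python
# A few rows copied from the gene feature table: (gene, start, stop, listed size).
rows = [
    ("trnM", 1, 67, 67),
    ("NAD2", 231, 1244, 1014),
    ("COX1", 1437, 2967, 1531),
    ("trnL1", 12613, 12683, 71),
    ("16S rRNA", 12659, 13963, 1280),
    ("12S rRNA", 14084, 14802, 719),
    ("D-loop", 14816, 15235, 420),
]

def span(start, stop):
    """Length of a feature given 1-based inclusive coordinates."""
    return stop - start + 1

def size_mismatches(features):
    """Return (gene, listed size, coordinate span) where the two disagree."""
    return [(gene, size, span(start, stop))
            for gene, start, stop, size in features
            if span(start, stop) != size]

def adjacent_overlap(prev, nxt):
    """Bases shared by two features; a negative value is an intergenic gap."""
    return prev[2] - nxt[1] + 1

for gene, listed, computed in size_mismatches(rows):
    print(f"{gene}: listed {listed} bp, coordinates span {computed} bp")

# Overlap between trnL1 and 16S rRNA as printed in the table.
print(adjacent_overlap(rows[3], rows[4]))
```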
Base composition and AT/GC skewness for each gene region of Pachliopta aristolochiae mitogenome.
| Gene | Size (bp) | A% | G% | T% | C% | A+T% | AT skew | GC skew |
|---|---|---|---|---|---|---|---|---|
| Whole mitogenome | 15,235 | 39.3 | 7.3 | 42.3 | 11.0 | 81.6 | −0.037 | −0.202 |
| Protein coding | 11,178 | 33.5 | 10.1 | 46.8 | 9.6 | 80.3 | −0.166 | 0.025 |
| tRNA | 1,452 | 43.0 | 10.5 | 39.1 | 7.5 | 82.1 | 0.048 | 0.167 |
| rRNA | 2,024 | 43.6 | 10.4 | 40.8 | 5.2 | 84.4 | 0.033 | 0.333 |
| D-loop (major) | 365 | 46.3 | 1.6 | 49.6 | 2.5 | 95.9 | −0.034 | −0.220 |
| D-loop (minor) | 45 | 46.7 | 2.2 | 51.1 | 0.0 | 97.8 | −0.045 | 1.000 |
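
The skew columns are consistent with the conventional definitions AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C). The source does not state the formulas, so the sketch below assumes these definitions and applies them to three rows of the table; the function names are illustrative only.

```python
def at_skew(a, t):
    """AT skew from A and T percentages: (A - T) / (A + T)."""
    return (a - t) / (a + t)

def gc_skew(g, c):
    """GC skew from G and C percentages: (G - C) / (G + C)."""
    return (g - c) / (g + c)

# Region: (A%, G%, T%, C%), copied from the base composition table.
composition = {
    "Whole mitogenome": (39.3, 7.3, 42.3, 11.0),
    "Protein coding": (33.5, 10.1, 46.8, 9.6),
    "D-loop (minor)": (46.7, 2.2, 51.1, 0.0),
}

for region, (a, g, t, c) in composition.items():
    print(f"{region}: AT skew {at_skew(a, t):.3f}, GC skew {gc_skew(g, c):.3f}")
```

A negative AT skew means T outnumbers A on the reported strand, and a GC skew of 1.000 for the minor D-loop simply reflects the absence of C in that 45 bp region.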
Fig. 2. Features of the two D-loop regions of the Pachliopta aristolochiae mitogenome, located between trnS1 and trnE and between 12S rRNA and trnM. Conserved motifs ‘ATAGA’ and ‘ATTTA’ are indicated in red and blue, respectively. The poly-T stretch is indicated in green, while microsatellite-like elements (TA)n and (AT)n are shown in yellow.
Lepidoptera mitogenomes used for the phylogenetic analysis. The P. aristolochiae mitogenome sequenced in this work is indicated by (*), with GenBank Accession No. MZ781228.
| Family | Subfamily | Species | GenBank Accession No. |
|---|---|---|---|
| Papilionidae | Papilioninae | | NC 053770 |
| Papilionidae | Parnassiinae | | NC 047306 |
| Papilionidae | Papilioninae | | NC 043911 |
| Papilionidae | Parnassiinae | | NC 041148 |
| Papilionidae | Papilioninae | | NC 034280 |
| Papilionidae | Papilioninae | | NC 034317 |
| Papilionidae | Papilioninae | | NC 034355 |
| Papilionidae | Papilioninae | | NC 034356 |
| Papilionidae | Papilioninae | | NC 034837 |
| Papilionidae | Papilioninae | | NC 025757 |
| Papilionidae | Papilioninae | | NC 037862 |
| Papilionidae | Parnassiinae | | NC 037863 |
| Papilionidae | Papilioninae | | NC 037867 |
| Papilionidae | Papilioninae | Losaria neptunus | NC 037868 |
| Papilionidae | Papilioninae | | NC 037869 |
| Papilionidae | Papilioninae | | NC 037870 |
| Papilionidae | Papilioninae | | NC 037871 |
| Papilionidae | Papilioninae | | NC 037874 |
| Papilionidae | Papilioninae | | NC 037875 |
| Papilionidae | Papilioninae | Pachliopta aristolochiae (*) | MZ781228 |
| Lycaenidae | Polyommatinae | Caerulea coeligena | NC 058607 |
| Lycaenidae | Polyommatinae | Shijimiaeoides divina | NC 029763 |
Fig. 3. Phylogenetic tree of Pachliopta aristolochiae (MZ781228), indicated by an asterisk (*), and 21 other Lepidoptera mitogenomes, built using the Maximum-Likelihood (ML) and Bayesian Inference (BI) approaches. Support values from the ML and BI analyses are shown at each node. Caerulea coeligena (NC 058607) and Shijimiaeoides divina (NC 029763) from the family Lycaenidae were used as outgroups.
| Subject | Genomics |
| Specific subject area | Lepidoptera, Papilionidae, Mitogenomics |
| Type of data | Fasta: Mitogenome sequence data; Tables: Sequencing data, gene features, base composition, list of Lepidoptera mitogenomes used for phylogenetic analyses; Figures: Circular mitogenome map, features of the D-loop regions, phylogenetic tree |
| How the data were acquired | Whole genome shotgun sequencing using the Illumina NovaSeq 6000 platform in 150 bp paired-end mode (PE150) |
| Data format | Raw and analyzed |
| Parameters for data collection | Genomic DNA was extracted from fresh tissue sample of |
| Description of data collection | The mitogenome was assembled using NOVOPlasty v.4.2 and run through the PALEOMIX BAM pipeline to assess read mapping. Annotation was performed on the MITOS v2 web server, and the predicted protein-coding genes were further verified using the Open Reading Frame (ORF) Finder. The circular mitogenome map was generated using OGDRAW. PhyloSuite v1.2.2 was used to extract, align and concatenate the 13 protein-coding genes from 22 Lepidoptera mitogenomes prior to phylogenetic analysis. PartitionFinder v2.2.1 was used to select the best partitioning scheme for the dataset. Phylogenetic trees were built with IQ-TREE and MrBayes v3.2.7 using the Maximum-Likelihood (ML) and Bayesian Inference (BI) methods, respectively. The resulting trees were visualized using FigTree v1.4.4. |
| Data source location | The sample |
| Data accessibility | Repository name: NCBI BioProject |
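
The concatenation step mentioned under "Description of data collection" joins the per-gene alignments into one supermatrix and records where each gene starts and ends, which is the information a partitioning tool then works from. The sketch below illustrates that idea only; it is not PhyloSuite's implementation, the function `concatenate` is hypothetical, and the taxa and sequences are made-up toy data rather than sequences from this study.

```python
def concatenate(alignments, taxa):
    """Join per-gene alignments into a supermatrix.

    alignments: dict mapping gene -> {taxon: aligned sequence}
    taxa: ordered list of taxon names to include
    Returns (supermatrix, partitions), where partitions holds
    (gene, first column, last column) with 1-based inclusive columns.
    """
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    position = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        length = lengths.pop()
        for taxon in taxa:
            # A taxon missing from a gene alignment is padded with gaps.
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, position, position + length - 1))
        position += length
    return {taxon: "".join(parts) for taxon, parts in pieces.items()}, partitions

# Toy, made-up alignments for two genes and three placeholder taxa.
toy = {
    "COX1": {"taxonA": "ATGGCA", "taxonB": "ATGGCT", "taxonC": "ATGGCA"},
    "ATP8": {"taxonA": "ATTCCA---", "taxonB": "ATTCCTAAA"},
}
matrix, parts = concatenate(toy, ["taxonA", "taxonB", "taxonC"])
print(parts)
for taxon, seq in matrix.items():
    print(taxon, seq)
```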