Literature DB >> 34887433

Complete mitochondrial genome sequencing of Oxycarenus laetus (Hemiptera: Lygaeidae) from two geographically distinct regions of India.

Shruthi Chalil Sureshan1, Ruchi Vivekanand Tanavade1, Sewali Ghosh2, Saswati Ghosh3, Raja Natesan Sella4, Habeeb Shaik Mohideen5.   

Abstract

Oxycarenus laetus is a seed-sap sucking pest affecting a variety of crops, including cotton plants. Rising incidence and pesticide resistance by O. laetus have been reported from India and neighbouring countries. In this study, O. laetus samples were collected from Bhatinda and Coimbatore (India). Pure mtDNA was isolated and sequenced using Illumina MiSeq. Both the samples were found to be identical species (99.9%), and the complete genome was circular (15,672 bp), consisting of 13 PCGs, 2 rRNA, 23 tRNA genes, and a 962 bp control region. The mitogenome is 74.1% AT-rich, 0.11 AT, and - 0.19 GC skewed. All the genes had ATN as the start codon except cox1 (TTG), and an additional trnT was predicted. Nearly all tRNAs folded into the clover-leaf structure, except trnS1 and trnV. The intergenic space between trnH and nad4, considered as a synapomorphy of Lygaeoidea, was displaced. Two 5 bp motifs AATGA and ACCTA, two tandem repeats, and a few microsatellite sequences, were also found. The phylogenetic tree was constructed using 36 mitogenomes from 7 super-families of Hemiptera by employing rigorous bootstrapping and ML. Ours is the first study to sequence the complete mitogenome of O. laetus or any Oxycarenus species. The findings from this study would further help in the evolutionary studies of Lygaeidae.
© 2021. The Author(s).

Entities:  

Mesh:

Year:  2021        PMID: 34887433      PMCID: PMC8660866          DOI: 10.1038/s41598-021-02881-0

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Polyphagous seed feeding dusky cotton bug Oxycarenus laetus infects various crops, especially cotton plants and the Malvaceae family[1]. Its incidence is on the rise, and it is reported from 100 + countries[2,3]. The life duration of this insect lasts from 35 to 45 days through five nymphal stages to develop into a fully grown adult[4]. Reports suggest the incidence of secondary pests is steeply increasing on cotton, especially the Oxycarenus spp.[5], including on the Bt varieties. The trend is similar in nearby China and Pakistan[6]; it has become resistant to several pesticides, and synergistic actions were required to control it[7]. Mitochondrial DNA has been a very significant marker in population and evolutionary studies. Uniparental transmission to offspring, high copy number per cell, and susceptibility to rapid evolution render mtDNA highly variable[8]. Though resistance to arthropods due to mtDNA is yet to be ascertained, several studies have reported cryptic species as one of the reasons; arising due to mutations either in the mtDNA or nuclear DNA. Uniparental maternal inheritance involving mtDNA has been reported as one of the reasons for the difference between susceptible and resistant species[9]. Recently, Ding et al.[10], by employing mitochondrial genome and transcriptome sequencing, identified the role of mitochondrial genes atp8 and nad5 and the associated amino acid mutations to be rendering resistance to pyrethroid by A. sinensis population. Due to its conserved nature and critical genetic markers, mtDNA has become the primary source for phylogenomics[11]. Identification of insects is crucial to managing endangered, protected, and invasive species in the ecosystem[12]. Insect mitogenome size ranges from 14,503 to 19,517 bp[13,14]. Most insects code for typical 37 genes, including 13 protein coding genes (PCGs), 22 tRNA genes and 2 rRNA genes, and a control region[15]. The most prominent non-coding region is the control region, also called the A + T rich region, responsible for transcription and translation. It is assumed to be analogous to the oriC[16]. Despite the technological advancements in biotechnology, there are only 9046 insect mitogenomes, of which 1016 are from the order Hemiptera and only four from its Lygaeidae family (as of 26/06/2020) which provides much scope in insect mitogenomic studies. It has been found that the North Indian fields had a higher incidence and resistance to pesticides by this bug in comparison to the South Indian cotton fields[7,17,18]. It could be due to the existence of different/cryptic species altogether or due to mutations in the detoxification system. In this study, the mitogenome of O. laetus (Bhatinda, N. India (BTI) and Coimbatore, S. India (CBE)) was wholly sequenced and annotated, making it the first such study in the genus Oxycarenus. Samples are collected from two locations that differ drastically in climate, soil conditions, and geography. This study was done to understand if two different species inhabit these regions. Studying the mitogenome of this insect could open up new avenues in Hemiptera research and its phylogenomics and better pest control.

Materials and methods

Sample collection

Samples of O. laetus were collected from the post harvested cotton fields located in Punjab Agriculture University, BTI (Punjab), and Tamilnadu Agriculture University, CBE (Tamilnadu) in India, with the help of scientists (species confirmation) and staff. Cotton bolls carrying O. laetus were plucked and transferred to transparent plastic boxes with their mouths covered with muslin clothe and transported to the laboratory. They were reared until an ample population was raised to extract mtDNA. Insects were reared on cotton bolls and maintained at 16:8 h of light:dark photoperiod, 27 °C, and 55% humidity. All the experiments were performed on the F1 generation population. All the chemicals and reagents used in this study were of finite grade.

Extraction and validation of mitochondrial DNA

A mixed population of adult O. laetus weighing 200 mg from both locations was macerated in 10 ml of homogenization buffer (0.4 M sucrose, 50 mM Tris HCl pH 7.5, 50 mM EDTA, 1% BSA, 10 mM β-mercaptoethanol)[19]. The mixture was centrifuged at 3500 rpm for 5 min at 4 °C. The supernatant was collected and centrifuged twice to remove cell debris and nuclear DNA. The supernatant was then centrifuged at 14,000 rpm for 15 min to pellet the mitochondria and resuspended in DNase-RNase free water. Traces of nuclear DNA were removed by incubating the mixture in 2.5 U of DNase I and DNase buffer (Qiagen) for 30 min. The action of DNase was terminated by adding 30 μl of 0.5 M EDTA. The mixture was again centrifuged at 12,000 rpm for 5 min, and the pellet was resuspended in 500 µl lysis buffer (100 mM Tris pH 8, 10 mM EDTA, 2% SDS, 15 μl of 20 mg/ml proteinase K), and the mixture was incubated at 55 °C for 1 h. Then, 140 μl of 5 M NaCl and 250 μl of 2% CTAB were added to the mixture and incubated at 62 °C for 25 min. RNA contamination was removed by adding 10 μl of 20 mg/mL RNase A and incubation at 37 °C for one hour. An equal volume of chloroform:isoamyl alcohol (24:1) was added to the mixture and centrifuged at 14,000 rpm for 15 min. The top aqueous phase was collected, and the step was repeated. The final aqueous phase was collected, and two volumes of isopropanol was added and incubated at 20 °C for one hour. The mtDNA was precipitated by centrifugation at 14,000 rpm for 15 min and washed with 70% ethanol. The air-dried pellet was then dissolved in 20–30 μl of TE buffer and quantitated using NanoDrop. The purity of the isolated mitochondrial DNA was assessed by amplifying COI (mitochondrial genome-specific) and Histone 4 gene (nuclear genome-specific). Table 1 shows the primers used in this study. The following conditions were used to amplify the COI gene: initial denaturation at 94 °C for 5 min; 30 cycles of final denaturation at 94 °C for 1 min, annealing at 52 °C for 1 min, elongation at 72 °C for 2 min 30 s; and final elongation at 72 °C for 5 min. Conditions used for amplifying Histone 4 gene are as follows: initial denaturation at 94 °C for 3 min; followed by 30 cycles of final denaturation at 94 °C for 1 min, annealing at 52 °C for 1 min, elongation at 72 °C for 1 min; and final elongation for 5 min. PCR products were visualized on 1% agarose gel.
Table 1

Primers used in the study.

NameSequence
COI-forwardGGTCAACAAATCATAAAGATATTGG
COI-reverseTAAACTTCAGGGTGACCAAAAAATCA
Histone4-forwardATTTCCACTCTGGTGGATAAGC
Histone4-reverseACACTTGGGCCTTTTAACTTTG
tRNH-NADH5-gap-forwardTTTCCCAATCAACAAAATAAAAA
tRNH-NADH5-gap-reverseTTGGGTCATTCTTTTTCAGG
Primers used in the study.

Mitogenome sequencing and assembly

Nextera XT DNA Library preparation protocol was followed for library preparation using 1 ng of fragmented and tagged quantified DNA. The DNA was tagged using Amplicon Tagment Mix present in the Nextera XT kit. The DNA was then subjected to 12 cycles of indexing-PCR, and the product was purified by high-prep magnetic beads and quantified. The DNA samples were sequenced De novo using MiSeq Illumina sequencer to generate paired-end reads of length 75 bp. Raw data generated in this study were submitted to NCBI’s SRA Archives (PRJNA520830). The raw reads were proofread using FastQC[20], and the low-quality reads and adapter sequences were filtered using Trimmomatic (0.38 v)[21] and Cutadapt (2.8v)[22]. The high-quality reads were assembled using NOVOPlasty[23].

Sequence analysis, annotation, and visualization

MITOS-1 and MITOS-2[24] were used to annotate the assembled mitogenome. The PCGs were confirmed by using ORF Finder, and BlastN confirmed rRNA and tRNAs. The control region was validated by aligning it with the transcriptome of data generated in our previous study (SRX5329347) using standalone BLAST. MEGA 7.0[25] was used to determine nucleotide composition, codon usage, and relative synonymous codon usage (RSCU). Skewness in the mitogenome was calculated using the formula: AT-skew = (A − T)/(A + T); GC-skew = (G − C)/(G + C)[26]. Tandem repeats in the mitogenome were identified using Tandem Repeats Finder[27]. Secondary structures of tRNA were predicted using Mfold[28]. The CGView Server[29] was used to generate the circular map of the mitogenome.

Cross verification of the rearrangement of intergenic region

The rearrangement of the conserved intergenic region of Lygaeidae was verified by designing primers to target and amplify the specific spacer. The amplified product was sequenced using the Sanger method and aligned with the assembled mitogenome. Table 1 lists the primer pair used for verification. PCR conditions used were as follows: initial denaturation at 94 °C for 3 min; 30 cycles of denaturation at 94 °C for 1 min, annealing at 57 °C for 1 min, elongation at 72 °C for 1 min; and final elongation at 72 °C for 5 min (Fig. S1).

Phylogenetic analysis

The phylogenetic study was done using 35 other species from hemipteran super-families. These included Aphidoidea, Cimicoidea, Reduvioidea, Pentatomoidea, Coreoidea, Pyrrhocoroidea, and Lygaeoidea (Table 4). Similar methodology as was followed by[30-32] was employed with few changes as was needed for this study. The significant difference here was that we used whole mitogenomes instead of PCGs, intending to study the relationship among the selected data as a function of complete mitogenome. Both PhyML online server[33] and raxMLGUI[34] were used to construct the phylogenetic tree. The mitogenomes were aligned using MUSCLE, conserved regions were identified, and the unreliable regions in the data were removed using the Gblocks program. In the raxmlGUI, ML + bootstrapping analysis was done involving 1000 replications invoking the GTRGAMMAI substitution model. The resulting tree was visualized using TreeDyn. We also used the Bayesian Inference method to generate a phylogenetic tree in MrBayes v3.2.7 in order to validate the tree[35]. Markov chain Monte Carlo (MCMC) simulations were run involving 1,000,000 generations with sampling every 1000 generations and the process was continued until the average standard deviation of split frequencies < 0.01. The first 25% sample trees were discarded in the burn-in process and the remaining trees were used to form the consensus tree and posterior probability (PP) values.
Table 4

Codon count and relative synonymous codon usage in O. laetus mitochondrial PCGs.

CodonCountRSCUCodonCountRSCUCodonCountRSCUCodonCountRSCU
UUU(F)203− 1.5UCU(S)70− 1.25UAU(Y)224− 1.53UGU(C)56− 1.51
UUC(F)68− 0.5UCC(S)42− 0.75UAC(Y)68− 0.47UGC(C)18− 0.49
UUA(L)310− 3.22UCA(S)93− 1.66UAA(*)248− 1.56UGA(W)86− 1.51
UUG(L)44− 0.46UCG(S)11− 0.2UAG(*)70− 0.44UGG(W)28− 0.49
CUU(L)81− 0.84CCU(P)87− 1.43CAU(H)71− 1.3CGU(R)13− 1.13
CUC(L)23− 0.24CCC(P)63− 1.03CAC(H)38− 0.7CGC(R)8− 0.7
CUA(L)99− 1.03CCA(P)80− 1.31CAA(Q)139− 1.54CGA(R)19− 1.65
CUG(L)21− 0.22CCG(P)14− 0.23CAG(Q)41− 0.46CGG(R)6− 0.52
AUU(I)311− 1.48ACU(T)94− 1.32AAU(N)308− 1.45AGU(S)58− 1.04
AUC(I)110− 0.52ACC(T)62− 0.87AAC(N)118− 0.55AGC(S)43− 0.77
AUA(M)289− 1.71ACA(T)114− 1.6AAA(K)391− 1.53AGA(S)90− 1.61
AUG(M)49− 0.29ACG(T)15− 0.21AAG(K)120− 0.47AGG(S)41− 0.73
GUU(V)49− 1.2GCU(A)42− 1.57GAU(D)55− 1.43GGU(G)27− 1.13
GUC(V)23− 0.56GCC(A)24− 0.9GAC(D)22− 0.57GGC(G)7− 0.29
GUA(V)80− 1.96GCA(A)37− 1.38GAA(E)98− 1.65GGA(G)46− 1.92
GUG(V)11− 0.27GCG(A)4− 0.15GAG(E)21− 0.35GGG(G)16− 0.67

Results

Genome organization and structure of O. laetus

The MiSeq data of BTI and CBE samples of O. laetus was submitted to the NCBI SRA database under the accession numbers SRR11024516 and SRR11024517, respectively. The study was done to understand whether there exists a different species of the genus Oxycarenus similar to O. laetus, which might be the reason for higher incidence and resistance. The study results are such that the mitogenomes of both the samples were highly identical (99.9%), confirming both to be identical species. Therefore, to maintain clarity of reading, findings of only one sample will be discussed hereafter, unless mentioned elsewhere in the manuscript. The mitogenome of O. laetus had all the general characteristics of other similar insects. The 15,672 bp long mitogenome is double-stranded and circular, consisting of typical 13 PCGs, 2 rRNA, and 23 tRNA genes. A unique feature witnessed was 38 genes instead of the typical 37 genes and a control region spanning 962 bp. The majority J-strand had 24, and the minority N-strand harbours 14 of these 38 genes. Gene overlaps were observed at 16 locations, and the longest overlap involved 25 bp between trnL1 and rRNAL[36]. Two overlaps found between atp8-atp6 and nad4-nad4l had a 7 bp ATGATAA pattern. Overall, ten intergenic/non-coding regions are present, ranging in length from 1 to 200 bp (Table 2, Fig. 1).
Table 2

Mitochondrial genome organization of O. laetus.

NameStartStrandSize (bp)AnticodonStart codonStop codonIntergenic nucleotides
trnI1–62J62GAT0
trnQ60–128N69TTG− 1
trnM128–195J68CAT0
nad2196–1185J990ATTTAA− 2
trnW1184–1246J63TCA− 8
trnC1239–1300N62GCA0
trnY1301–1362N62GTA1
cox11364–2902J1539TTGTAA− 5
trnL22898–2962J65TAA0
cox22963–3637J675ATAAAA1
trnK3639–3709J71AAG0
trnD3710–3775J66GTC0
atp83776–3934J159ATTTAA− 7
atp63928–4593J666ATGTAA− 1
cox34593–5379J787ATGT0
trnG5380–5445J66GGA0
nad35446–5799J354ATATAA0
trnA5800–5862J63TGC0
trnR5863–5925J63TCG0
trnN5926–5993J68GTT− 1
trnS15993–6061J69GCT− 1
trnE6061–6125J65TTC0
trnF6126–6188N63GAA− 1
nad56188–7861N1674ATTTAA41
trnH7901–7962N62GTG2
nad47965–9284N1320ATGTAA− 7
nad4l9278–9559N282ATTTAA2
trnT_09562–9623J62TGT− 1
trnT_19823–9884J62TGT200
trnP9885–9948N64TGG1
nad69950–10,417J468ATTTAA− 1
cob10,417–11,553J1137ATGTAG− 2
trnS211,552–11,620J69TGA17
nad111,637–12,566N930ATATAA− 6
trnL112,561–12,626N66TAG− 25
rrnL12,602–13,845N124430
trnV13,876–13,942N67TAC1
rrnS13,944–14,710N767− 1
Control region14,710–15,672962

J majority strand, N minority strand, − ve overlapping genes, + ve intergenic space.

Figure 1

Circular map of the mitogenome of O. laetus. The above figure tells us the location of PCGs, tRNAs, rRNAs, and control region. The inner green, purple and black circles show the -ve GC skew, + ve GC skew, and GC content.

Mitochondrial genome organization of O. laetus. J majority strand, N minority strand, − ve overlapping genes, + ve intergenic space. Circular map of the mitogenome of O. laetus. The above figure tells us the location of PCGs, tRNAs, rRNAs, and control region. The inner green, purple and black circles show the -ve GC skew, + ve GC skew, and GC content. The gene arrangement was similar to the ancestral arrangement with a significant change in the intergenic space between nad4 and tRNA-H, which is reduced to 11 bp. Another striking feature was the duplication of the gene tRNA-threonine, which could have occurred due to anticodon mutation or replication slipping[37]. The displacement of tRNA-H, closer to nad4 and shortening of the 41 bp non-coding region to 11 bp was validated by aligning the concerned region of the assembled data with the Sanger sequencing, and 100% identity was obtained, confirming the displacement as shown in Fig. S2. Lygaeidae have a unique synapomorphy by having an unusual spacer between tRNA-His and nad4 in their mitogenomes[38]. Though this was witnessed in O. laetus mitogenome, perhaps on the other end, a feature that is generally expected to be in Lygaeidae.

Nucleotide composition and codon usage

Nucleotide composition is highly skewed towards A/T, a common feature in mitogenomes[30,31], and the AT content was 74.22%, and the rest being GC content (25.6%) (Table 3). The standard range for Hemiptera ranges between 69.53–83.53% and 73.15–76.04% in the case of Lygaeoidae (Table 4). The AT skew observed was 0.11, indicating adenines more in number than thymines, and the GC skew of − 0.19 indicates cytosines outnumbering guanines by a whisker.
Table 3

Comparison of skewness in other insect species.

NCBI Acc.#SpeciesFamilyAGCTTotalAT%GC%AT skewGC skew
NC_015842.1Agriosphodrus dohrniReduviidae642220072569547216,47072.2227.780.08− 0.12
NC_011594.1Acyrthosiphon pisumAphididae78739261665650516,96984.7315.270.10− 0.29
NC_012446.1Aeschyntelus notatusRhopalidae622514682062477714,53275.7124.290.13− 0.17
MT430940.1Aphis gossypiiAphididae72879241676615816,04583.8016.200.08− 0.29
HQ902161.1Apolygus lucorumMiridae635815131915498214,76876.7923.210.12− 0.12
JX839706.1Chauliops fallaxMalcidae704615102630455315,73973.7026.300.21− 0.27
NC_012449.1Coptosoma bifariaPlataspidae661620182621492416,17971.3328.670.15− 0.13
JQ739179.1Coridius chinensisDinidoridae624915892068474214,64875.0324.970.14− 0.13
NC_020373.1Dolycoris baccarumPentatomidae692218672542521816,54973.3626.640.14− 0.15
NC_012421.1Dysdercus cingulatusPyrrhocoridae716114132212546316,24977.6922.310.13− 0.22
NC_042436.1Euscopus rufipesPyrrhocoridae700514982475537716,35575.7124.290.13− 0.25
NC_022449.1Eusthenes cupreusTessaratomidae681618822316521516,22974.1325.870.13− 0.10
NC_012424.1Geocoris pallidipennisGeocoridae632115022021474814,59275.8624.140.14− 0.15
NC_013272.1Halyomorpha halysPentatomidae711116642245549716,51776.3323.670.13− 0.15
NC_012456.1Hydaropsis longirostrisCoreidae687115142541559516,52175.4624.540.10− 0.25
KJ584365.1Kleidocerys resedaeLygaeidae664113992120452814,68876.0423.960.19− 0.20
MF497725.1Lygaeus sp.Lygaeidae656314632208500115,23575.9024.100.14− 0.20
EU401991.2Lygus lineolarisMiridae731616702410563117,02776.0423.960.13− 0.18
NC_012457.1Macroscytus gibbulusCydnidae607616132219471214,62073.7926.210.13− 0.16
NC_012458.1Malcus inconspicuousMalcidae689513932065522215,57577.8022.200.14− 0.19
NC_015342.1Megacopta cribrariaPlataspidae661218892737440915,64770.4429.560.20− 0.18
KX505856.1Neolethaeus assamensisRhyparochromidae669215672478433015,06773.1526.850.21− 0.23
JQ806057.1Nesidiocoris tenuisMiridae725319432442590417,54275.0025.000.10− 0.11
NC_011755.1Nezara viridulaPentatomidae730116612244568316,88976.8823.120.12− 0.15
MN599979.1Nysius plebeiusLygaeidae757416562340579717,36776.9923.010.13− 0.17
SRR852651xOxycarenus laetusLygaeidae647216362393516415,66574.2825.720.11− 0.19
KX216853.1Panaorus albomaculatusRhyparochromidae725815552364516816,34576.0223.980.17− 0.21
NC_012460.1Phaenacantha marcidaColobathristidae644215632296423914,54073.4626.540.21− 0.19
NC_012432.1Physopelta guttaLargidae671015102297441814,93574.5125.490.21− 0.21
NC_046740.1Rhopalosiphum nymphaeaeAphididae70109081543613215,59384.2815.720.07− 0.26
NC_012462.1Riptortus pedestrisAlydidae719816242400596917,19176.5923.410.09− 0.19
NC_012888.1Stictopleurus subviridisRhopalidae651215582163508615,31975.7124.290.12− 0.16
NC_002609.1Triatoma dimidiataReduviidae691319013275492017,00969.5730.430.17− 0.27
NC_020144.1Urochela quadrinotataUrostylididae668516332442582716,58775.4324.570.07− 0.20
NC_012823.1Valentia hoffmanniReduviidae646716152488505515,62573.7426.260.12− 0.21
NC_012464.1Yemmalysus parallelusBerytidae662916201974552415,74777.1822.820.09− 0.10
Comparison of skewness in other insect species. Codon count and relative synonymous codon usage in O. laetus mitochondrial PCGs. A similar trend was observed with PCGs as well. The PCGs in the J strand have AT skew of − 0.025 and a GC skew of − 0.130. At the same time, PCGs in the N strand have AT skew of − 0.35 and a GC skew of 0.244. They are indicating that J-strand PCGs are much less AT and GC skewed than N-strand genes. The rRNA genes have a GC content of 25.7% and 22.6% for rrnS and rrnL, respectively. G. pallidipennis is found to closely match the statistics obtained for the O. laetus when the comparison is made among the Lygaeoidae (Table 3). After analyzing 100 + genomes, Wei et al., reported GC skewness as an indication of oriC inversion associated with replication orientation and AT skew dictating gene direction and codon usage, both related to the mitogenome strand asymmetry[39]. A slight variation in the average codons used among our samples was seen and found 14 bp less in BTI. However, a more extensive analysis revealed that the codon AUU (isoleucine) = 311.0 and GCG (Alanine) = 4.0 were having the highest and lowest frequencies in both the samples (Table 7). Predominant codons like Ile, Met, Phe, and Leu were entirely composed of A and T nucleotides. The nucleotide composition of each codon observed for the entire genome shows that the possibility of A occurring at the first position is 42.4, the second position is 39.0, and the third position is 42.6. It was 33.0, 34.0, and 42.6; 10.4, 10.8, and 9.8; and 15.3, 15.4, and 14.1 for T, G, and C nucleotides, respectively.
Table 7

Tandem repeats in the mitogenome of O. laetus.

IndicesPeriod sizeCopy numberConsensus sizePercent matchesPercent indelsScoreACGTEntropy (0–2)Consensus seq
5756–579922222918725429341.44TATATTAGAATGAACTAAATAA
13,503–13,564223207911705410431.09TAATTATAAATTAAATTTAA
The relative synonymous codon usage (RSCU) for each amino acid was calculated (Table 4), and a graph was manually generated. It is the number of times a codon is repeated in relation to the uniform synonymous codon usage, i.e., all codons of amino acid have the same chances of appearing in the sequence. It can be observed from Fig. 2 that all the codons of all amino acids have been strictly used, but the distribution is not equal. This distribution can act as markers for identification at the species level and intra-ordinal level.
Figure 2

Relative synonymous codon usage.

Relative synonymous codon usage.

Protein coding genes

All the identified 13 PCGs were confirmed for length and sequence by comparing them with the Illumina HiSeq RNA-Seq data generated by our lab for another study (SRR8526511 and SRR8526512). The results obtained from the MITOS2 server agreed with the cross-comparison and verification with the transcriptomic data (results not shown). As shown in Table 2 above, twelve genes have ATN as a start codon, only cox1 has TTG, which can be seen as a start codon for cox1 in many other species[40,41]. Four PCGs (nad2, cox2, nad3, nad1) have ATA as their start codon, four other genes (atp8, nad5, nad4l, and nad6) have ATT as their start codon. The genes atp6, cox3, nad4, cob start with codon ATG. Almost all the genes end with the typical non-sense codon, but few genes have incomplete/different termination codons like cox2 ending with AAA, cox3 ending with an incomplete codon: T-. The majority of the genes are found on the J-strand with much lower AT and GC Skew. Each PCG was aligned and compared with 35 other species to confirm consensus sequences and significant motifs. Further, from Table 5, we can infer that there was not a vast difference between AT and CG richness between rRNA, tRNA, and most PCGs. However, the cytochrome family of PCGs cox1, cox2, cox3, and cob had a higher representation of GC than the other components of the mitogenome. These genes had an average of 30% GC content which was much lower, 17–26% for other PCGs. The pattern followed by the cytochrome family of genes was also evident for the control region. It has been known that the cytochrome family of genes, primarily, cox1 and cob, are used as markers for eukaryotic species identification and evolutionary studies. The higher incidence of GC in these PCGs may well be attributed to being less prone to environmental mutations, codon usage, and therefore, species identity is maintained[42].
Table 5

Nucleotide composition and skewness of the mitogenome of O. laetus.

O. laetusTotal size (bp)ATGCGC%AT%AT SkewGC Skew
mtgenome15,672647151611633238925.66474.2220.117− 0.188
nad29903814079111120.40479.596− 0.033− 0.099
cox1153949255322227132.03467.901− 0.058− 0.099
cox26752492238611730.07469.9260.055− 0.153
atp81597154112321.38478.6160.136− 0.353
atp66662442457110626.57773.423− 0.002− 0.198
cox378728527010612629.47970.5210.027− 0.086
nad3354136129365325.14174.8590.026− 0.191
nad5167443085723415223.05976.882− 0.3320.212
nad4132031568020212224.54575.379− 0.3670.247
nad4l28264153471823.05076.950− 0.4100.446
nad6468173214344717.30882.692− 0.106− 0.160
cob113737643814118228.40871.592− 0.076− 0.127
nad19302324681428824.73175.269− 0.3370.235
tRNA genes149757554721515924.98374.9500.0240.150
rRNA genes201165987330117823.81976.181− 0.1400.257
Control region9623283369320230.66569.023− 0.012− 0.369
Nucleotide composition and skewness of the mitogenome of O. laetus.

Non-coding regions

The control region is the largest functional non-coding entity in the mitogenomes, with a length of 962 bp. There are four major spacers spread throughout the genome, ranging from 17 to 200 bp. Small gaps of 1–3 bp were also observed between the genes. The important rearrangement region of 41 bp is found between nad5 and trnH. The second-largest gap measuring 200 bp in length is found between trnT replicates (trnT_0 and trnT_1), other small intergenic spaces were found between trnS2 and nad1, rrnL and trnV with lengths of 17 and 25 bp, respectively. The spacer between trnS and nad1 is common in most insects. The conserved sequence block (CSB) region has two consensus regions, indicating that tandem duplication and random loss (TDRL) process could be a common intermediate step[37,43]. Two consensus motifs are 5 bp long AATGA and ACCTA sequences. The control region is the major non-coding spacer, considered the box of origin sites for replication and transcription[44]. The largest crucial non-coding region is also helpful in species identification. Due to its high divergence, the evolution of tandem repeats, and variability, it can be used as markers, making it a potential species-specific marker[11]. Studies about the control region in Echinoides have shown that due to the region's unique features, it has high compatibility across the entire class, outperforming conventional mitochondrial markers[45]. Of all the arthropods sequenced, a few prevalent motifs were noted: (1) Tandem/microsatellite repeats (Tables 6, 7), (2) Poly-thymine sequence, (3) a AT-rich region, (4) A stem-loop structure[15]. We identified all these motifs in our mitogenomes. Some additional striking features were also seen in other insects. Two types of tandem repeats were identified with a consensus size of 22 copy number 2 and another with size 20 and copy number 3 spread across the genome except in the control region.
Table 6

Microsatellites in the control region.

PositionLengthRepeatsSequence
14,75423CCCCCC
14,84823TATATA
14,85825TATATATATA
14,96343ATTTATTTATTT
15,19623CCCCCC
15,25923CACACA
15,35323AAAAAA
Microsatellites in the control region. Tandem repeats in the mitogenome of O. laetus.

tRNAs and rRNAs

As mentioned earlier, we identified 23 tRNAs in the O. laetus mitogenome rather than the regular 22, lengths ranging from 60 to 72 bp. The structures predicted were clover-leaf or isoforms with a few exceptions: trnS1(AGN), in which the DHU arm (dihydrouridine) was reduced to just a loop of 6 bp, which is the case for most metazoans (Fig. S3)[46]. The trnV had formed two loops in both the DHU and TѱC arms. A duplicate of trnT_0 was identified as trnT_1, which has a similar structure to trnS1 and resembled its isoform due to the reduction of the TѱC arm to a loop, though the sequence was highly similar to trnT_0. These features proved that this novel gene order could be explained by the tandem duplication/random loss (TDRL) model[37]. The wobble and mismatch pairs common in insects' tRNA were corrected by comparing data from two software MITOS2[24] and tRNAScan-SE[47]. Keeping with the trend of mitogenome patterns, two rRNA genes were identified and further aligned with 35 other species to confirm the lengths and motifs. The large subunit gene of rRNA was 1244 bp in length and positioned between trnV and trnL-1. The GC content was about 22.6%, and the gene is AT-rich. The smaller subunit gene of rRNA was 767 bp in length with a slightly higher GC content of 25.8%. We used only one O. laetus (CBE) mitogenome to maintain equality with the 35 other species' mitogenomes and avoid any skewness/influence of our mitogenome on the final tree. Seven different super-families chosen in this study belonged to Hemiptera (Aphidoidea, Cimicoidea, Reduvioidea, Pentatomoidea, Coreoidea, Pyrrhocoroidea, and Lygaeoidea). Hemiptera is one of the largest and diverse groups of insects which has more than 110 genera. GAMMA + P-Invar model parameter estimates were based on the 13,883 alignment patterns with the help of the GTR substitution matrix, and the overall tree length is 16.016. In the phylogenetic tree shown in Fig. 3 and Fig. S4 (Bayesian inference method), it can be seen that all the species are well placed under their respective super-families. All the 11 Lygaeoidea super-family species have found themselves accommodated in a polyphyletic clade, including species from the Lygaeidae to which O. laetus belongs. Studies have confirmed that Lygaeidae is polyphyletic[48], meaning evolution from several common ancestors. Incidentally, scientists have differed over the years on the lineage of Lygaeidae. They have concluded the need for further studies to reach a consensus on this, as the super-family Lygaeoidea consists of several sub-families within itself.
Figure 3

Phylogenetic relationship inferred among 36 whole mitochondrial genomes from seven different super-families of Hemiptera. The above tree shows O. laetus and other member species are clustered under their respective super-families.

Phylogenetic relationship inferred among 36 whole mitochondrial genomes from seven different super-families of Hemiptera. The above tree shows O. laetus and other member species are clustered under their respective super-families. O. laetus remains a separate member among the selected Lygaeoidea species closely related to Y. parallelus (Berytidae), resulting in Lygaeus Sp., and N. plebeius in one clade and M. inconspicuous and K. resedae resedae in the other. The rigorous phylogenetic analysis supported by higher branch confirmation values only confirms the clustering and relationships already documented and established. In our earlier study[49], we had identified the species and studied its phylogenetic relationship with other species using COI sequences. Also, the clustering of O. laetus in the Lygaeoidea only confirms this study's sequencing, assembly, and annotation accuracy. However, there always will be opportunities for betterment. Since there are no similar prior mitogenomic studies in Lygaeoidea/Lygaeidae, we could not compare our results. Nevertheless, results obtained from this study will make a significant contribution to future studies involving the Lygaeidae family under the Lygaeoidea super-family.

Discussion

The species O. laetus, although a secondary pest of the cotton plant, is a severe threat to the overall quality and quantity of cotton produce. The insect is widely distributed in various parts of Asia irrespective of climate and soil conditions. The emphasis of this study was to sequence and annotate the complete mitogenome to understand if the two species from these locations are different from each other that might be rendering higher pesticide resistance in BTI (N. India). As both the BTI and CBE samples essentially have identical mitogenomes, the species are the same. Hence, the probability of mutation in the detox system could be assumed as the reason for its survival under such relatively high pesticide and harsh conditions, which the gene expression studies can establish. The mtDNA is responsible for various essential physiological functions, primarily oxidative phosphorylation[50]. Genes such as NAD and Cytochrome oxidase play vital roles in the ETC pathway[51]. Although the gut microbes are primarily responsible for digestion of harmful pesticides, it works in association with the gut cells for the ultimate goal of respiration[52]. Therefore, the mitochondrial genes are indirectly involved in the digestion of pesticides and insecticides. Targeting these genes and other pathways, such as P450, GST, ABC, and others, can help eradicate pests without reliance on harmful chemicals[53,54]. There has been massive research on mtDNA of insects due to its smaller size, comparative and phylogenetic analysis being the majority[55]. The order Hemiptera also has such comparative and phylogenetic studies[56]. This provides data on intra-ordinal and inter-ordinal relations amongst different insect species. The Lygaeidae family under this order comprises many important pests and species, yet only four have completely sequenced and analysed mitogenomes[57]. This type of analysis provides a perspective for other research and even helps understand evolution to some extent. In this study, the mitochondrial genome of O. laetus has been sequenced, thoroughly analysed, annotated, and compared with 35 other species from Hemiptera. The gene order is similar to the ancestral arrangement except for a few rearrangements. The genome organization was similar to almost all Hemipterans. Unlike the typical 37 genes, 38 genes were determined; this included 13 PCGs, 23 tRNAs, and 2 rRNAs. The gene order was compared with other species from other genera also. There were 16 overlaps observed between the genes which were more or less similar to other species. The highest was 25 bp obtained between trnL1 and rrnL. Another conserved overlap of 7 bp was observed between atp8 and atp6, and nad4 and nad4l were also observed in Melon thrips[58]. The control region is the largest non-coding region obtained. Almost all the PCGs started with the start codon ATN; only the cox1 gene has TTG as its start codon, again observed in other species as well[41]. The genes end with complete stop codons except for cox2 and cox3. All the features obtained concerning the PCGs are ancestral and do not vary much from what is already known. The codon usage, AT, and GC skews were all similar to the range obtained for Hemipterans. Some striking differences have been found in both samples, but deflect from the common are: one of them is the duplication of tRNA-Threonine, which increased the number of tRNAs to 23. The duplicates are labeled as trnT_0 and trnT_1. The duplicate has a similar structure to trnS1 and resembled its isoform due to the reduction of the TѱC arm to a loop[37]. Similar observations were seen in Reduvius tenebrosus[59]. This novel rearrangement has been explained by the tandem duplication/random loss (TDRL) model. All the other tRNA structures predicted were clover-leaf, with the above two exceptions[60]. This feature has not been observed for previously sequenced mitogenomes but is seen for newly annotated ones, hinting at mutation and evolution. The second striking difference observed was the gene rearrangement of the intergenic space between trnH and nad4, which is considered a synapomorphy of Lygaeoidea, which was reduced to 11 bp, as opposed to 41 bp in the same super-family. The control region is relatively smaller in length with 962 bp. All the standard features of the control region, similar to the study of C. fallax, were observed except for the tandem repeats[61]. Microsatellite repeats replacing the tandem repeats were also observed in various species such as Panaorus albomaculatus[36].

Conclusion

The taxonomy of Hemiptera has always put up several challenging questions, and researchers across the globe have done extensive studies. The emphasis of this study was to find if there exist another cryptic sister species in the two extreme regions of India—Coimbatore, and Bhatinda, where cotton crop scientists witnessed different levels of bug infestation. As the study concludes, the mitogenomes were 99.9% identical and gave us an understanding of their presence in these two places. The mitogenome sequenced provides many genetic markers from PCGs to control regions for identification purposes. Our study is the first detailed mitochondrial genomic study in the genus Oxycarenus. We also have found some novel rearrangements in the mitogenome, which will help understand the genus's evolution and ecology. The results obtained from this study, along with the whole genome of this species when available, would provide possibly new and exciting prospects that could help in biocontrol and mitigating resistance of this economically important species. From the resistance to pesticide perspective, further studies involving whole transcriptomic approaches may lead to vital discoveries. Therefore, the data and the findings from this study would further help in the evolutionary studies of Oxycarenidae, Lygaeidae, and Hemiptera in the future. Supplementary Information.
  44 in total

1.  A hotspot of gene order rearrangement by tandem duplication and random loss in the vertebrate mitochondrial genome.

Authors:  Diego San Mauro; David J Gower; Rafael Zardoya; Mark Wilkinson
Journal:  Mol Biol Evol       Date:  2005-09-21       Impact factor: 16.240

Review 2.  Insect mitochondrial genomics: implications for evolution and phylogeny.

Authors:  Stephen L Cameron
Journal:  Annu Rev Entomol       Date:  2013-10-16       Impact factor: 19.686

3.  Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes.

Authors:  N T Perna; T D Kocher
Journal:  J Mol Evol       Date:  1995-09       Impact factor: 2.395

4.  MEGA: Molecular Evolutionary Genetics Analysis software for microcomputers.

Authors:  S Kumar; K Tamura; M Nei
Journal:  Comput Appl Biosci       Date:  1994-04

5.  Organization and dynamics of human mitochondrial DNA.

Authors:  Frédéric Legros; Florence Malka; Paule Frachon; Anne Lombès; Manuel Rojo
Journal:  J Cell Sci       Date:  2004-05-11       Impact factor: 5.285

6.  MITOS: improved de novo metazoan mitochondrial genome annotation.

Authors:  Matthias Bernt; Alexander Donath; Frank Jühling; Fabian Externbrink; Catherine Florentz; Guido Fritzsch; Joern Pütz; Martin Middendorf; Peter F Stadler
Journal:  Mol Phylogenet Evol       Date:  2012-09-07       Impact factor: 4.286

Review 7.  Hemipteran mitochondrial genomes: features, structures and implications for phylogeny.

Authors:  Yuan Wang; Jing Chen; Li-Yun Jiang; Ge-Xia Qiao
Journal:  Int J Mol Sci       Date:  2015-06-01       Impact factor: 5.923

8.  NOVOPlasty: de novo assembly of organelle genomes from whole genome data.

Authors:  Nicolas Dierckxsens; Patrick Mardulyn; Guillaume Smits
Journal:  Nucleic Acids Res       Date:  2017-02-28       Impact factor: 16.971

9.  Phylogenetic analysis of the true water bugs (Insecta: Hemiptera: Heteroptera: Nepomorpha): evidence from mitochondrial genomes.

Authors:  Jimeng Hua; Ming Li; Pengzhi Dong; Ying Cui; Qiang Xie; Wenjun Bu
Journal:  BMC Evol Biol       Date:  2009-06-15       Impact factor: 3.260

10.  Long-branch attraction and the phylogeny of true water bugs (Hemiptera: Nepomorpha) as estimated from mitochondrial genomes.

Authors:  Teng Li; Jimeng Hua; April M Wright; Ying Cui; Qiang Xie; Wenjun Bu; David M Hillis
Journal:  BMC Evol Biol       Date:  2014-05-07       Impact factor: 3.260

View more
  1 in total

Review 1.  Mechanisms of Mitochondrial Malfunction in Alzheimer's Disease: New Therapeutic Hope.

Authors:  Showkat Ul Nabi; Andleeb Khan; Ehraz Mehmood Siddiqui; Muneeb U Rehman; Saeed Alshahrani; Azher Arafah; Sidharth Mehan; Rana M Alsaffar; Athanasios Alexiou; Bairong Shen
Journal:  Oxid Med Cell Longev       Date:  2022-05-14       Impact factor: 7.310

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.