Literature DB >> 35586600

The complete mitochondrial genome of the Mexican-endemic cavefish Ophisternoninfernale (Synbranchiformes, Synbranchidae): insights on patterns of selection and implications for synbranchiform phylogenetics.

Adán Fernando Mar-Silva1,2, Jairo Arroyave3, Píndaro Díaz-Jaimes1.   

Abstract

Ophisternoninfernale is one of the 200+ troglobitic fish species worldwide, and one of the two cave-dwelling fishes endemic to the karstic aquifer of the Yucatán Peninsula, Mexico. Because of its elusive nature and the relative inaccessibility of its habitat, there is virtually no genetic information on this enigmatic fish. Herein we report the complete mitochondrial genome of O.infernale, which overall exhibits a configuration comparable to that of other synbranchiforms as well as of more distantly related teleosts. The KA/KS ratio indicates that most mtDNA PCGs in synbranchiforms have evolved under strong purifying selection, preventing major structural and functional protein changes. The few instances of PCGs under positive selection might be related to adaptation to decreased oxygen availability. Phylogenetic analysis of mtDNA comparative data from synbranchiforms and closely related taxa (including the indostomid Indostomusparadoxus) corroborate the notion that indostomids are more closely related to synbranchiforms than to gasterosteoids, but without rendering the former paraphyletic. Our phylogenetic results also suggest that New World species of Ophisternon might be more closely related to Synbranchus than to the remaining Ophisternon species. This novel phylogenetic hypothesis, however, should be further tested in the context of a comprehensive systematic study of the group. Adán Fernando Mar-Silva, Jairo Arroyave, Píndaro Díaz-Jaimes.

Entities:  

Keywords:  Blind swamp eel; Yucatan Peninsula; karst aquifer; mitogenome; systematics; troglobitic

Year:  2022        PMID: 35586600      PMCID: PMC8933388          DOI: 10.3897/zookeys.1089.78182

Source DB:  PubMed          Journal:  Zookeys        ISSN: 1313-2970            Impact factor:   1.546


Introduction

(, ), commonly known as the blind swamp eel, is a rare and elusive freshwater teleost fish endemic to the cenotes and submerged caves of the Yucatan Peninsula (YP) in southeastern Mexico. Like most troglobites, exhibits typical regressive troglomorphic traits associated with life in absolute darkness, such as the absence of both pigmentation and eyes. Besides its endemism and troglomorphism, is exceptional in that it is one of two fish species that permanently inhabit the dark and oligotrophic subterranean waters of the YP karst aquifer; the other being the Mexican blind brotula () (Arroyave 2020). The relative inaccessibility of its habitat–submerged caves or cenotes well inside dry caves–coupled with its highly cryptic lifestyle–often found burrowed under the sediment or hiding inside tangles of submerged roots and crevices–have made the study of the blind swamp eel particularly challenging, and as a result very little is known about this intriguing fish species. Notably, the total number of occurrence records for is less than 20 localities throughout its potential range of distribution (Arroyave et al. 2019). By virtue of its rarity, endemism, and restricted geographic distribution, in addition to the current threats faced by its habitat and region, has recently been categorized as Endangered (EN) (Arroyave et al. 2019). Unsurprisingly, genetic data from are virtually nonexistent, and this has hampered efforts at establishing its exact phylogenetic placement (Perdices et al. 2005). Besides their importance for phylogenetic and biogeographic research, genomic data are fundamental for addressing other evolutionary lines of inquiry, such as the genetic basis of troglomorphism (Protas and Jeffery 2012). Hence the need for generating genomic information of such a unique, endangered, and understudied species such as . In order to provide genomic resources potentially informative for future evolutionary studies, here we present the first complete mitochondrial genome of the troglomorphic and YP-endemic . In addition to sequencing, assembling, and annotating its mitogenome, we present detailed descriptive (genome size and organization, protein-coding genes [PCGs], non-coding regions, and RNAs features) and comparative (patterns of selection on PCGs, phylogenetic) analyses. Leveraging novel mitogenomic data to shed light on the systematics of is particularly relevant and timely because of ongoing conflicting hypotheses of relationships regarding the limits and composition of this teleost order that involve the phylogenetic placement of the monogeneric family with respect to synbranchiforms and closely related euteleost lineages (Van Der Laan et al. 2014; Nelson et al. 2016; Betancur-R et al. 2017). Furthermore, the phylogenetic position of the blind swamp eel, , in the context of the diversification of the family , has yet to be established (Perdices et al. 2005).

Material and methods

Sample collection and raw data generation

All methods were carried out in accordance with relevant guidelines and regulations, and the study was carried out in compliance with the ARRIVE guidelines. Sampling of the individual used to generate the mitogenome presented here was accomplished with the assistance of a professional cave diver who captured the specimen using a custom-made hand net specifically designed for efficient capture and secure storage while cave diving. The sample was collected under collecting permit SGPA/DGVS/05375/19 issued by the Mexican Ministry of Environment and Natural Resources (Secretaría de Medio Ambiente y Recursos Naturales; SEMARNAT) to JA. The sampling locality is the cenote Kan-Chin (Huhí, Yucatán), located at . The voucher specimen was euthanized with MS-222 prior to preservation in accordance with recommended guidelines for the use of fishes in research (Nickum et al. 2004), fixed in a 10% formalin solution, and subsequently transferred to 70% ethanol for long-term storage in the Colección Nacional de Peces (CNPE) of the Instituto de Biología (IB) at the Universidad Nacional Autónoma de México (UNAM), where it has been catalogued and deposited (CNPEIBUNAM 23285). A tissue sample (muscle fragment) was taken prior to specimen fixation, preserved in 95% ethanol, and eventually cryopreserved at -80 °C. High-molecular genomic DNA was extracted using the phenol-chloroform protocol (Sambrook et al. 1989). The DNA was sheared by sonication with a Bioruptor pico of Diagenode and Minichiller. Sonication was performed using six cycles of alternating 30 s ultrasonic bursts and 30 s pauses in a 4 °C water bath. For library preparation we used a DNA sample of 200 ng which was quantified using a Qubit fluorometer (Invitrogen). Library preparation was carried out using the KAPA Biosystem Hyper Kit (Kapa, Biosystem Inc., Wilmington, MA). Fragmented DNA was ligated to custom, TruSeq-style dual-indexing adapters (Glenn et al. 2016). Fragments were size selected in a ~300–500 bp range which was enriched through PCR, purified and normalized. The Illumina NextSeq v2 300 cycle kit was used for sequencing paired-end 150 nucleotide reads at the Georgia Genomics Facility, University of Georgia, Athens, USA.

Mitogenome assembly and annotation

The quality of the raw data was assessed with FastQC (Andrews, 2010). Good-quality sequences that did not contain ambiguous nucleotides and reads with average quality of 30Q were demultiplexed, trimmed and merged using Geneious Prime 2020.0.4 (https://www.geneious.com). Mitogenome assembly was conducted with MITObim v.1.9 (Hahn et al. 2013) using two reference mitogenomes from close relatives of available in GenBank: (MT436449) and (AP004439). These reference mitogenomes were aligned in order to generate a consensus sequence for use during the annotation procedure. MitoFish MitoAnnotator (Iwasaki et al. 2013) and MITOS (Bernt et al. 2013) were used to identify and annotate protein-coding genes (PCGs), transfer RNAs (tRNAs), and ribosomal RNAs (rRNAs). The resultant annotated mitochondrial genome was deposited in the GenBank database under accession number OM388306.

Descriptive analyses

Nucleotide and amino acid composition, codon usage profiles of protein-coding genes (PCGs), Relative Synonymous Codon Usage (RSCU), and characterization the non-coding mtDNA control region (CR) were computed with mega X (Kumar et al. 2018). Nucleotide composition skewness was calculated with the formulas AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C) (Perna and Kocher 1995). Prediction of tRNAs secondary structure was accomplished with tRNAScan-SE 2.0 (Chan and Lowe 2019) through the webserver http://lowelab.ucsc.edu/tRNAscan-SE/, using Infernal without HMM filter search mode and “vertebrate mitochondrial” as sequence source (Lowe and Chan 2016). Analysis and prediction of CR secondary structure in was accomplished using the software ClustalW (Thompson et al. 2003) as implemented in mega X (Kumar et al. 2018) by comparison (via multiple sequence alignment) with reports of secondary CR structure from two other teleost fishes, namely (EU659698) (Zhao et al. 2006) and (MN603795) (Sharma et al. 2020).

Comparative analyses

We investigated patterns of selection on PCGs on a mitogenomic scale and phylogenetic relationships among major synbranchiform lineages based on all mitogenomic comparative data for the group available on GenBank. To measure of the strength and mode of natural selection acting on PCGs, we estimated the ratio of non-synonymous (KA) to synonymous (KS) substitutions (KA/KS, also known as ω or dN/dS) using the HyPhy 2.5 package (Kosakovsky Pond et al. 2020) as implemented in mega X (Kumar et al. 2018) based on the newly assembled mitochondrial genome (, OM388306) and seven additional synbranchiform mitogenomes previously available in GenBank: (MT436449), (AP004439), (NC003192), (NC023977), (NC035141), (KT443991), and (NC032080). To compare patterns of selection between synbranchiform families, we conducted two separate KA/KS analyses, one for synbranchids and one for mastacembelids. The taxonomic sampling for phylogenetic analyses included representatives of the synbranchiform families and , as well as a representative of , a monogeneric family historically classified in the on the basis of morphological evidence (Van Der Laan et al. 2014; Nelson et al. 2016) but more recently assigned to the in accordance to the results of molecular phylogenetic studies (Betancur-R. et al. 2013; Betancur-R et al. 2017). The lack of published mitochondrial genomes of fishes from the synbranchiform family prevented us from including representatives of this taxon in our analyses. The ingroup consisted of the synbranchids (OM388306), (MT436449), (AP004439) and (NC003192), the mastacembelids (NC023977), (NC035141), (KT443991) and (NC032080), and the indostomid (NC004401). The outgroup consisted of representatives of close relatives of such as the anabantiforms (NC030542) and (AP006809), and the gasterosteiforms (NC041244) and (NC011571); the last two included to test the phylogenetic position of with respect to members of the . The phylogeny was rooted at the viviparous brotula (AP004408). Phylogenetic relationships were inferred based on a concatenated alignment of all 13 PCGs. DNA sequence data from each PCG was independently aligned via multiple sequence alignment using the software MUSCLE (Edgar 2004) under default parameters via the “translation align” tool of the software Geneious Prime 2020.0.4 (https://www.geneious.com). The best-fit substitution model for each PCG was determined according to the corrected Akaike Information Criterion (AICc) with the software jModelTest2 (v. 2.1.10) (Darriba et al. 2012) under the following likelihood settings: number of substitution schemes = “3”; base frequencies = “+F”; rate variation = “+I and + G with nCat = 4”; base tree for likelihood calculations = “ML optimized”; and base tree search = “Best” (effectively evaluating among all 24 “classical” GTR-derived models). Individual alignments (ATP6=681 bp, ATP8=168 bp, =1,539 bp, COX2=690 bp, COX3=783 bp, =1,137 bp, NAD1=975 bp, NAD2=1,053 bp, NAD3=348 bp, NAD4=1,380 bp, NAD4L=294 bp, NAD5=1,836 bp, and NAD6=525 bp) were subsequently concatenated using the software 2matrix (Salinas and Little 2014), yielding a data matrix totaling 11,409 aligned bp. Maximum Likelihood inference of phylogeny was carried out on the concatenated alignment partitioned by gene using the software RAxML-NG (v. 1.0.1) (Kozlov et al. 2019) through the CIPRES Science Gateway (Miller et al. 2010), with nodal support estimated by means of the bootstrap character resampling method (Felsenstein 1985) based on 1000 pseudoreplicates.

Results and discussion

Genome size and organization

The complete mitochondrial genome of presented herein (GenBank accession number OM388306) is 16,804 bp in total length (Fig. 1; Table 1), a somewhat larger size than previously published synbranchiform mitogenomes, which range from 16,493 bp (in ) (or from 16,152 bp if considering the putative synbranchiform ) to 16,622 bp (in ). Although the mitogenome of the synbranchid reported in GenBank (AP004439) is considerably shorter (15,561 bp), this significant difference in length is actually due to it missing the NAD1 gene (normally ~1,000 bp) as a result of reported technical difficulties during sequencing (Miya et al. 2003). The composition and general arrangement of mitochondrial genes in is identical to that reported for other synbranchiforms (Li et al. 2016; Han et al. 2018; White et al. 2020) as well as for more distantly related teleosts (Miya et al. 2001, 2003; Satoh et al. 2016), and consists of a total of 37 genes divided into the following categories: 13 PCGs, 2 rRNAs, 22 tRNAs, and the non-coding control region (CR) (Fig. 1; Table 1). Twenty-eight genes (12 PCGs, 2 rRNAs, 14 tRNAs) plus CR are located on the H-strand, while the remaining nine genes (NAD6 and 8 tRNAs) are located on the L-strand (Table 1); a configuration that corresponds to those of previously reported synbrachiform mitogenomes (Li et al. 2016; Han et al. 2018; White et al. 2020). The overall base composition of the mitogenome is T=28.7%, A=31.6%, G=13.2%, and C=26.5%, which is fairly similar to those of other synbrachiform mitogenomes (Table 2). Nucleotide composition, however, is biased toward A+T (60.4%), with displaying the highest values of this metric among the analyzed synbranchiforms. The mitogenome of exhibits positive AT (0.046) and negative GC (-0.277) skewness, a general pattern shared with other species of the (Table 2).
Figure 1.

Annotated map of the mitochondrial circular genome of . The outer ring corresponds to the H- (outermost) and L-strands, and depicts the location of PCGs (in black, except for ND6 which is encoded in the L-strand and is portrayed in red), the non-coding control region (in dark brown), tRNAs (in red), and rRNAs (in light brown). The inner ring (black sliding window) denotes GC content along the genome. Live specimen photograph taken in the Cenote Kancabchen (Homún, Yucatán), courtesy of cave diver Erick Sosa.

Table 1.

Mitochondrial genes and associated features of . Intergenic space (IGS) described as intergenic (+) or overlapping nucleotides (–). AA = amino acid.

LocusTypeOne-letter codeStartEndLength (bp)Strand# of AAAnticodonStart codonStop codonIGS
tRNAPhe tRNAF16969HGAA0
12s rRNA rRNA701017948H0
tRNAVal tRNAV1018109174HTAC0
16s rRNA rRNA109227661092H0
tRNA-Leu tRNAL2767284074HTAA63
NAD1 Protein-coding29043872951H316ATGTAA7
tRNAIle tRNAI3880394970HGAT8
tRNAGln tRNAQ3958402871LTTG–1
tRNAMet tRNAM4028409770HCAT0
NAD2 Protein-coding409851441047H337ATGAGA–3
tRNATrp tRNAW5142521170HTCA1
tRNAAla tRNAA5213528169LTGC1
tRNAAsn tRNAN5283535573LGTT53
tRNACys tRNAC5409547567LGCA0
tRNATyr tRNAY5476554267LGTA1
COX1 Protein-coding554470821539H489GTGAGA–4
tRNASer tRNAS7127719771LTGA2
tRNAAsp tRNAD7200727071HGTC2
COX2 Protein-coding72737963691H225ATGT0
tRNALys tRNAK7964803673HTTT1
ATP8 Protein-coding80388205168H51ATGTAA–8
ATP6 Protein-coding81968878683H223ATGTA0
COX3 Protein-coding88799662784H249ATGT0
tRNAGly tRNAG9663973169HTCC0
NAD3 Protein-coding973210079348H112ATGGAC0
tRNAArg tRNAR100801014869HTCG0
NAD4L Protein-coding1014910445297H97ATATAA–5
NAD4 Protein-coding10439118191380H445ATGT0
tRNAHis tRNAH118201188869HGTG0
tRNASer tRNAS118891195264HGCT–1
tRNALeu tRNAL119521202473HTAG1
NAD5 Protein-coding12026138551830H598ATGTA–2
NAD6 Protein-coding1385214373522L172ATGT1
tRNAGlu tRNAE143751444369LTTC2
CYTB Protein-coding14446155861141H369ATGT0
tRNAThr tRNAT155871566276HTGT–1
tRNAPro tRNAP156621573069LTGG0
D-loop Non-coding15731168041074H0
Table 2.

Size and nucleotide composition of the complete synbranchiform mitochondrial genomes (and their concatenated PCGs) analyzed in this study. *NAD1 gene missing from published mitogenome.

SpeciesGenBank Accession #Entire genomeProtein-coding genes
Length (bp)A(%)T(%)C(%)G(%)AT(%)AT skewGC skewLength (bp)AT(%)AT skewGC skew
Ophisternoninfernale OM388306 1680431.628.726.513.260.40.046-0.2771144960.1-0.038-0.348
Ophisternoncandidum MT436449 1652631.527.527.913.1590.067-0.361137759.1-0.015-0.374
Synbranchusmarmoratus* AP004439 1556130.726.828.51457.50.067-0.3411052957.1-0.027-0.355
Monopterusalbus NC003192 1662228.927.229.414.556.10.03-0.341143054.9-0.052-0.356
Mastacembelusarmatus NC023977 1648729.125.330.914.754.40.069-0.3551140453.1-0.013-0.381
Mastacembeluserythrotaenia NC035141 164932924.531.614.953.40.086-0.3571141752.2-0.003-0.382
Macrognathusaculeatus KT443991 165433026.528.714.856.40.063-0.3221142055.9-0.014-0.345
Macrognathuspancalus NC032080 1654929.72629.614.755.70.664-0.3371142054.9-0.02-0.363
Mitochondrial genes and associated features of . Intergenic space (IGS) described as intergenic (+) or overlapping nucleotides (–). AA = amino acid. Size and nucleotide composition of the complete synbranchiform mitochondrial genomes (and their concatenated PCGs) analyzed in this study. *NAD1 gene missing from published mitogenome. Annotated map of the mitochondrial circular genome of . The outer ring corresponds to the H- (outermost) and L-strands, and depicts the location of PCGs (in black, except for ND6 which is encoded in the L-strand and is portrayed in red), the non-coding control region (in dark brown), tRNAs (in red), and rRNAs (in light brown). The inner ring (black sliding window) denotes GC content along the genome. Live specimen photograph taken in the Cenote Kancabchen (Homún, Yucatán), courtesy of cave diver Erick Sosa.

Protein-coding genes

The 13 PCGs, altogether totaling 11,449 bp, correspond to 68.1% of the mitogenome. These genes consist of seven regions that code for the subunits of the NADH dehydrogenase (ubiquinone) protein complex (NAD1-6, NADL4), three that code for the subunits of the enzyme cytochrome c oxidase (-3), one that codes for the enzyme cytochrome b (), and two that code for the subunits 6 and 8 of the enzyme ATP synthase FO (ATP6, ATP8). Except for and ND4L, PCGs exhibit an ATG (Met) start codon­, which is the standard in eukaryotic systems (Kozak 1983). The start codon exhibited by (GTG), however, is fairly common among vertebrates (Nwobodo et al. 2019). Conversely, an initiation-codon change from ATG (Met) to ATA (Ile) such as the one observed in ND4L is less common. Notably, of the synbranchiform mitogenomes analyzed, only that of displays ATA as ND4L initiation codon. Most PCGs (10 out of 13) exhibit a TAA stop codon, which is a standard termination codon common in vertebrate mtDNA. However, of these 10 genes only three (NAD1, NAD4L, ATP8) display a complete codon (TAA), while the remaining seven (ATP6, COX2, COX3, NAD4, NAD5, NAD6, ) contain an incomplete stop codon (either TA or T). Of the remaining three PCGs, NAD2 and have the stop codon AGA, while NAD3 has the stop codon GAC (Table 2). PCGs in the mitogenome of exhibit levels of A+T content (60.1%) comparable to–though slightly higher than–those of other synbranchiforms, which range from 53.1% in to 59.1% in (Table 2). In contrast to our findings for the entire mitogenome, AT-skews in PGCs across all synbranchiform mitogenomes analyzed exhibit negative values. Conversely, and in correspondence with our whole-mitogenome results, GC-skews in PGCs also exhibit negative values and highly similar across most analyzed synbranchiforms. A total of 3816 amino acids are encoded by PCGs in the mitogenome of , with Leu (14.7%), Ser (9.4%), Thr (7.8%), and Pro (7.7%) being the most frequent, while Met (1.1%) being the least common. RSCU values represent the ratio between the observed usage frequency of one codon in a gene sample and the expected usage frequency in the synonymous codon family, given that all codons for the particular amino acid are used equally. The synonymous codons with RSCU values > 1.0 have positive codon usage bias and are defined as abundant codons, whereas those with RSCU values < 1.0 have negative codon usage bias and are defined as less-abundant codons (Gun et al. 2018). Results from RSCU analysis of PCGs in the mitogenome of indicate that the most frequently used codons are ACC (1.59%), AAA (1.56%), TTA, ATA, and GAA (1.49%), which code for the amino acids Thr, Lys, Leu, Met, and Glu, respectively. On the other hand, codons encoding Prol (CCG, 0.16%), Thr (ACG, 0.2%), Ala (GCG, 0.23%), Ser (TCG, 0.39%), and Leu (CTG, 0.4%; TTA, 0.48%) are the least frequent (Fig. 2; Table 3).
Figure 2.

Results from analysis of Relative Synonymous Codon Usage (RSCU) of the mitochondrial genome of . Codon families are plotted on the x-axis. The label for the 2, 4, or 6 codons that compose each family is shown in the boxes below the x-axis, and the colors correspond to those in the stacked columns. RSCU values are shown on the y-axis.

Table 3.

Results from the Relative Synonymous Codon Usage (RSCU) analysis for the PCGs of the mitochondrial genome of .

Amino acidCodonNumberFreq. (%) RSCU Amino acidCodonNumberFreq. (%) RSCU
PheTTT1102.91.02AlaGCA481.31.12
TTC1052.80.98GCG100.30.23
LeuTTA1393.61.49TyrTAT1193.11.13
TTG451.20.48TAC922.40.87
CTT1443.81.54HisCAU591.51.08
CTC852.20.91CAC501.30.92
CTA1102.91.18GlnCAA7821.42
CTG3710.4CAG320.80.58
IleATT1343.51.17AsnAAT1012.61
ATC952.50.83AAC1022.71
MetATA1193.11.49LysAAA7821.56
ATG411.10.51AAG220.60.44
ValGTT310.81.18AspGAT340.91.1
GTC230.60.88GAC280.70.9
GTA320.81.22GluGAA501.31.49
GTG190.50.72GAG170.40.51
SerTCT731.91.22CysTGT300.80.98
TCC802.11.34TGC310.81.02
TCA882.31.47TrpTGA631.71.26
TCG230.60.39TGG3710.74
AGT421.10.7ArgCGT140.40.67
AGC521.40.87CGC220.61.06
ProCCT1042.71.42CGA280.71.35
CCC912.41.24CGG190.50.92
CCA862.31.17GlyGGT441.21.35
CCG120.30.16GGC360.91.11
ThrACT661.70.87GGA300.80.92
ACC1203.11.59GGG200.50.62
ACA1012.61.34StopTAA832.21.78
ACG150.40.2TAG3910.84
AlaGCT451.21.05AGA3710.8
GCC691.81.6AGG270.70.58
Results from the Relative Synonymous Codon Usage (RSCU) analysis for the PCGs of the mitochondrial genome of . Results from analysis of Relative Synonymous Codon Usage (RSCU) of the mitochondrial genome of . Codon families are plotted on the x-axis. The label for the 2, 4, or 6 codons that compose each family is shown in the boxes below the x-axis, and the colors correspond to those in the stacked columns. RSCU values are shown on the y-axis.

Transfer and ribosomal RNAs

The mitogenome of contains the typical 22 tRNAs usually documented for mitogenomes of other teleosts and vertebrates (Lee et al. 1995; Díaz-Jaimes et al. 2016; Satoh et al. 2016; Nwobodo et al. 2019; White et al. 2020). The genomic organization of tRNAs in is identical to that reported for (White et al. 2020) and other synbranchids (Li et al. 2016; Han et al. 2018). Altogether, tRNAs total 1547 bp, with individual ones ranging from 64 bp (tRNASer) to 76 bp (tRNAThr) (Table 1). Fourteen tRNAs are encoded in the H-strand, while the remaining eight in the L-strand (Fig. 1; Table 1). Twenty-one of the 22 tRNAs fold into the canonical cloverleaf secondary structure that consists of four domains (AA stem, D arm, AC arm, and T arm) and a variable loop (Fig. 3). Notably, the tRNASer (11889–11952) exhibits an unusual structure in which the D arm is missing. Although any change in tRNA secondary structure could potentially alter its amino acid recognition capability (Nwobodo et al. 2019), it has been shown that loss of the D arm does not necessarily imply reduced functionality; in fact, almost all tRNAsSer for AGY/N codons lack the D arm, and truncated tRNAs appear to have been compensated for by several interacting factors (Watanabe et al. 2014). Furthermore, among fishes, loss of the tRNASer D arm is not unique to , for it has been reported in several species, including chondrichthyans such as (Chen et al. 2013), (Chen et al. 2016), and (Zhu et al. 2017), as well as teleosts such as and (Bbole et al. 2018). Although most tRNAs present the canonical 7-bp T loop, nonstandard T-loop lengths were observed in tRNAMet (6 bp), tRNAPhe (8 bp), and tRNASer (9 bp). Other deviations from the traditional tRNA secondary structure that could affect functionality is the presence of extra loops. The tRNAArg (10080–10148) in the mitogenome of exhibits a loop at the base of the AA stem, thus potentially affecting aminoacylation. The nucleotide composition in the tRNAs of the mitogenome is T=29%, A=31.4%, G=20.2%, and C=19.3%. The genes that code for the mitochondrial 12S and 16S rRNA subunits in are 948 bp and 1092 bp long, respectively, and are located on the H-strand separated by the tRNAVal, just like in most teleost fishes (Lee et al. 1995; Satoh et al. 2016).
Figure 3.

Secondary structure of the 22 tRNA genes of the mitochondrial genome of predicted by tRNAScan-SE 2.0.

Secondary structure of the 22 tRNA genes of the mitochondrial genome of predicted by tRNAScan-SE 2.0.

Non-coding regions

The mtDNA control region of is 1074 bp long (15731–16804), encoded in the H-strand, and flanked by tRNAPro and tRNAPhe at the 5' and 3' ends, respectively (Fig. 1; Table 1), which is consistent with our understanding of mitogenome structure and organization in fishes (Lee et al. 1995; Rasmussen and Arnason 1999; Satoh et al. 2016). CR nucleotide composition is T=33.1%, A=36.7%, G=10.9%, and C=19.3%, with A+T content (69.8%) larger than that of the entire mitogenome but similar to that of other fishes including synbranchids (Li et al. 2016; Han et al. 2018). Like in other fishes, CR in is divided into three domains: a central conserved domain flanked and two hypervariable domains (upstream and downstream). Three conserved sequence blocks (CSBs) were detected at the central conserved domain (CSB-F, CSB-E, CSB-D) as well as at the downstream hypervariable region (CSB1, CSB2, CSB3) (Fig. 4). Although additional CBSs have been identified for the central conserved domain (CSB-B, CSB-C) in mammals (Southern et al. 1988), the three identified herein for are those commonly found in fishes (Broughton and Dowling 1994; Chen et al. 2012). The upstream hypervariable domain in the CR of has a length of 256 bp and includes two copies of the motif TACAT and three copies of palindromic motif ATGTA. A change in the motif sequence (TGCAT) was observed in and but not in . Compared to those from the central conserved domain, CSBs in the downstream hypervariable domain displayed larger variation across the three fish species compared. Notably, CSB2 and CSB3 were slightly more conserved than CSB1, a pattern that has been reported for other fishes (Chen et al. 2012).
Figure 4.

Comparison (multiple sequence alignment) of the mtDNA control region of with those of fellow teleosts and . The alignment displays the three canonical domains distinguished by Termination Associated Sequences (TAS) of the upstream hypervariable region (in red), central conserved domain blocks (CSB-F, CSB-E, CSB-D) (in blue), and conserved sequence blocks of the downstream hypervariable region (CSB-1, CSB-2 and CSB-3) (in green).

Comparison (multiple sequence alignment) of the mtDNA control region of with those of fellow teleosts and . The alignment displays the three canonical domains distinguished by Termination Associated Sequences (TAS) of the upstream hypervariable region (in red), central conserved domain blocks (CSB-F, CSB-E, CSB-D) (in blue), and conserved sequence blocks of the downstream hypervariable region (CSB-1, CSB-2 and CSB-3) (in green).

Patterns of selection on PCGs

Results from KA/KS analyses (Fig. 5) indicate that most mtDNA PCGs in synbranchiform fishes have evolved under strong purifying selection (KA/KS <<1), preventing major structural and functional protein changes. Exceptions to this general pattern were observed for and NAD6 in synbranchids (Fig. 5a) and for NAD4 and NAD6 in mastacembelids (Fig. 5b), where significant signals of positive selection were detected. Studies in different groups of animals, including cephalopods (Almeida et al. 2015), rodents (Tomasco and Lessa 2011), and humans (DeHaan et al. 2004), have linked amino acid replacements in NAD6 to adaptive selection to hypoxic conditions. Because numerous synbranchiform species are known to be fossorial and to inhabit low-oxygen waters, the observed signature of positive selection in NAD6 might be related to adaptation to decreased oxygen availability. Notably, a recent comparative mitogenomic study of the African tilapias and similarly uncovered a pattern of positive selection in NAD6 suggestive of adaptation in response to changing environments (Bbole et al. 2018). In contrast to the pattern observed for NAD6, selection in and NAD4 is completely conflicting between synbranchiform families. While in mastacembelids –like most mitochondrial genes–has evolved under purifying selection (KA/KS <1), the opposite happens in synbranchids. Although speculative at this point, the fact that half of our synbranchid dataset consists of troglomorphic cave-dwelling species ( and ) (vs. none in the mastacembelid dataset) could explain the observed differences in selection patterns. Compared to surface waters, subterranean waters such as those of karst environments that harbor populations of and (in Mexico and Australia, respectively) contain low dissolved oxygen (Huppop 2000). Because of its role in aerobic metabolism, might therefore be a target of directional selection promoting the evolution of more metabolically efficient variants in hypogean lineages (Boggs and Gross 2021). In contrast, the observed conflicting patterns of selection in NAD4–another gene involved in cellular respiration–between mastacembelids (positive) and synbranchids (purifying), do not seem to be readily explained by ecological differences related to cave life.
Figure 5.

Patterns of selection in mtDNA PCGs of synbranchiform fishes. Results from KA/KS ratio analysis on mitochondrial PCGs (x-axis) in synbranchiform fishes of the families (a) and (b).

Patterns of selection in mtDNA PCGs of synbranchiform fishes. Results from KA/KS ratio analysis on mitochondrial PCGs (x-axis) in synbranchiform fishes of the families (a) and (b).

Phylogeny and systematics of synbranchiform fishes

Our understanding of phylogenetic relationships in synbranchiform fishes is incipient compared to that of other teleost groups. Despite the fact that for the past two decades molecular systematics has been routinely employed to refine and update the classification of fishes and our knowledge of their evolutionary history (Betancur-R et al. 2017), a comprehensive molecular phylogeny of the has yet to be proposed. Apart from a phylogenetic study focused on Central American synbranchids (Perdices et al. 2005), no studies have investigated synbranchiform relationships using comparative DNA sequence data. Surprisingly, recent phylogenetic studies focused on higher-level relationships among major lineages of bony fishes (Betancur-R. et al. 2013; Betancur-R et al. 2017) resulted in the reassignment of armored sticklebacks (family , traditionally placed in the suborder , order ) to the order . Although the classification of indostomids as gasterosteoids had been previously questioned on the basis of mitogenomic evidence (Miya et al. 2003, 2005; Kawahara et al. 2008), it was not until the phylogenetic classification of Betancur et al. (2013, 2017) that the family was transferred to the order . This proposal, however, was not adopted by the most authoritative contemporary standard references of fish systematics (Van Der Laan et al. 2014; Nelson et al. 2016), on the grounds of lack of morphological support and the need for further corroboration. It should be noted that previous molecular phylogenetic studies that cast doubt on the traditional placement of indostomids, whether based on “legacy” markers (Betancur-R. et al. 2013; Betancur-R et al. 2017) or complete mitochondrial genomes (Miya et al. 2003, 2005; Kawahara et al. 2008), relied on a very limited representation of synbranchiform diversity. In contrast, our phylogenetic analysis used mitogenomic data from a comparatively larger taxon sampling that included eight synbranchiform species from five genera (, , , , and ) and two families (, ). Notably, our phylogenetic results (Fig. 6) corroborate the notion that indostomids are more closely related to synbranchiforms than to gasterosteoids. Nevertheless, contrary to the findings of studies that have recently challenged the traditional classification of indostomids with respect to synbranchiforms (Kawahara et al. 2008; Betancur-R. et al. 2013; Betancur-R et al. 2017), our inferred phylogenetic placement of does not render paraphyletic. With the caveat that our sampling of synbranchiforms and closely related lineages is only partial, our results imply that indostomids are in fact the sister lineage of the order . While this phylogenetic pattern (topology) might be considered sufficient for lumping indostomids with synbranchiforms, examination of relative branch lengths (Fig. 6) suggests that is indeed a highly divergent lineage. In order to acknowledge their genetic and morphological (Britz and Johnson 2002) distinctiveness, indostomids may in fact warrant an order of their own. Within , our results remarkably do not support the monophyly of the synbranchid genus , for is resolved as more closely related to than to (Fig. 6). While at first sight this novel finding of a sister-group relationship between and is certainly unexpected, this hypothesis might not be that far-fetched from a biogeographic perspective, and when considering both the striking external morphological similarity between the two genera and the taxonomic ambiguities surrounding the classificatory history of the group (Rosen and Greenwood 1976). is restricted to the New World and comprises three species: (Central and South America), (Madeira River basin, Bolivia), and (Pará, Brazil). as currently delimited exhibits an essentially Gondwanan distribution, with six valid species distributed in Middle America (, ), Australia (, ), South Asia and Western Pacific (), and West Africa (). Assuming that Gondwanan drift vicariance is the main process responsible for the present-day globally disjunct distribution of the genus (Rosen 1975), the split between the Mexican-endemic and the West Australian-endemic should be at least as old as the Middle Jurassic separation of Eastern Gondwana (Antarctica, Madagascar, India, and Australia) from Western Gondwana (South America and Africa), dated at ca 165 Ma (McLoughlin 2001). From this it follows that the split between and should be even older. Notably, the only phylogenetic study that has investigated divergence times via molecular dating in a group of synbranchiforms (Perdices et al. 2005) estimated a comparatively much younger age (< 20 Ma) for the split between and . Although marine dispersal and extinction could be invoked in an attempt to reconcile biogeographic patterns with our admittedly limited knowledge of the timescale of synbranchiform diversification, the paraphyly of remains problematic. Our phylogenetic results coupled with the abovementioned estimates of synbranchid divergence times (Perdices et al. 2005) lead us to hypothesize that perhaps New World species of ( and ) are in fact more closely related to species than to the remaining species. As such, New World species of would have to be transferred to the genus . This phylogenetic scenario is also compatible with a likely very recent origin of the cave-dwelling . Although there is virtually no information regarding the timing of origin and colonization of the fishes that inhabit the cenotes and submerged caves of the YP karstic aquifer (Arroyave et al. 2021), these aquatic habitats are supposed to be extremely young, effectively established not before 20,000 years ago, at the end of the last glacial maximum in the Northern Hemisphere, when rising sea levels eventually resulted in the flooding of karstic sinkholes and dry caves (Coke IV 2019). Such a recent origin for is certainly much easier to explain as a result of speciation from a fellow New World lineage, such as or . Regardless of the appeal and feasibility of these hypotheses concerning the systematics of New World in general and the origins of in particular, our phylogenetic findings and their interpretation need to be taken with caution because of their absolute reliance on mtDNA only. It is well known that the mitochondrial genome is effectively a single locus (Avise 2012), that individual gene and species trees are not always congruent (Maddison 1997), and that nuclear and mtDNA inheritance patterns are not always congruent either (Funk and Omland 2003). Notwithstanding these limitations, our results emphasize the pressing need for a comprehensive systematic and biogeographic study of synbranchiform fishes, ideally based on genome-wide sequence data.
Figure 6.

Phylogenetic relationships of major synbranchiform lineages. Molecular phylogeny based on comparative mitochondrial PCGs from relevant available mitogenomes and the newly generated herein for . Troglobitic cave-dwelling species are marked with an asterisk to distinguish them from surface-dwelling ones. Outgroup taxa not shown. Colored circles on nodes indicate degree of clade support as determined by bootstrap values.

Phylogenetic relationships of major synbranchiform lineages. Molecular phylogeny based on comparative mitochondrial PCGs from relevant available mitogenomes and the newly generated herein for . Troglobitic cave-dwelling species are marked with an asterisk to distinguish them from surface-dwelling ones. Outgroup taxa not shown. Colored circles on nodes indicate degree of clade support as determined by bootstrap values.

Conclusions

The first complete annotated mitochondrial genome of , herein reported, exhibits an organization and arrangement similar to that of other synbranchiform fishes as well as of more distantly related teleosts. Based on our comparative mitogenomic dataset, most mitochondrial PCGs in synbranchiforms appear to have evolved under strong purifying selection, which has prevented major structural and functional protein changes. The few instances of mtDNA PCGs under positive selection might be related to adaptation to decreased oxygen availability and the evolution of more metabolically efficient variants in hypogean synbranchiform lineages. Phylogenetic analysis of mtDNA comparative data from synbranchiforms and closely related taxa (including the indostomid ) corroborate the notion that indostomids are more closely related to synbranchiforms than to gasterosteoids, but without rendering the former paraphyletic. Our phylogenetic results also suggest that New World species of might be more closely related to than to the remaining species. This novel phylogenetic hypothesis, however, should be further tested in the context of a comprehensive systematic study of the group.
  42 in total

1.  Multiple sequence alignment using ClustalW and ClustalX.

Authors:  Julie D Thompson; Toby J Gibson; Des G Higgins
Journal:  Curr Protoc Bioinformatics       Date:  2002-08

2.  MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms.

Authors:  Sudhir Kumar; Glen Stecher; Michael Li; Christina Knyaz; Koichiro Tamura
Journal:  Mol Biol Evol       Date:  2018-06-01       Impact factor: 16.240

3.  HyPhy 2.5-A Customizable Platform for Evolutionary Hypothesis Testing Using Phylogenies.

Authors:  Sergei L Kosakovsky Pond; Art F Y Poon; Ryan Velazquez; Steven Weaver; N Lance Hepler; Ben Murrell; Stephen D Shank; Brittany Rife Magalis; Dave Bouvier; Anton Nekrutenko; Sadie Wisotsky; Stephanie J Spielman; Simon D W Frost; Spencer V Muse
Journal:  Mol Biol Evol       Date:  2020-01-01       Impact factor: 16.240

4.  Structure of the mitochondrial DNA control region of the sinipercine fishes and their phylogenetic relationship.

Authors:  Jin-Liang Zhao; Wei-Wei Wang; Si-Fa Li; Wan-Qi Cai
Journal:  Yi Chuan Xue Bao       Date:  2006-09

5.  Phylogenetic studies of complete mitochondrial DNA molecules place cartilaginous fishes within the tree of bony fishes.

Authors:  A S Rasmussen; U Arnason
Journal:  J Mol Evol       Date:  1999-01       Impact factor: 2.395

6.  MITOS: improved de novo metazoan mitochondrial genome annotation.

Authors:  Matthias Bernt; Alexander Donath; Frank Jühling; Fabian Externbrink; Catherine Florentz; Guido Fritzsch; Joern Pütz; Martin Middendorf; Peter F Stadler
Journal:  Mol Phylogenet Evol       Date:  2012-09-07       Impact factor: 4.286

7.  Sequencing and characterization of the complete mitochondrial genome of Japanese Swellshark (Cephalloscyllium umbratile).

Authors:  Ke-Cheng Zhu; Yin-Yin Liang; Na Wu; Hua-Yang Guo; Nan Zhang; Shi-Gui Jiang; Dian-Chang Zhang
Journal:  Sci Rep       Date:  2017-11-10       Impact factor: 4.379

8.  RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference.

Authors:  Alexey M Kozlov; Diego Darriba; Tomáš Flouri; Benoit Morel; Alexandros Stamatakis
Journal:  Bioinformatics       Date:  2019-11-01       Impact factor: 6.937

9.  Complete mitochondrial genome of the porbeagle shark, Lamna nasus (Chondrichthyes, Lamnidae).

Authors:  Píndaro Díaz-Jaimes; Manuel Uribe-Alcocer; Douglas H Adams; Jose Miguel Rangel-Morales; Natalia J Bayona-Vásquez
Journal:  Mitochondrial DNA B Resour       Date:  2016-10-03       Impact factor: 0.658

10.  MUSCLE: a multiple sequence alignment method with reduced time and space complexity.

Authors:  Robert C Edgar
Journal:  BMC Bioinformatics       Date:  2004-08-19       Impact factor: 3.169

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.