The complete mitochondrial genome of Rhynchocypris oxycephalus (Teleostei: Cyprinidae) and its phylogenetic implications.

Zhichao Zhang1,2, Qiqun Cheng1, Yushuang Ge1,3.   

Abstract

Rhynchocypris oxycephalus (Teleostei: Cyprinidae) is a typical small cold-water fish that is widely distributed, mainly in East Asia. Here, we sequenced the complete mitochondrial genome of R. oxycephalus and studied its phylogenetic implications. The R. oxycephalus mitogenome is 16,609 bp in length (GenBank accession no.: MH885043) and contains 13 protein-coding genes (PCGs), two rRNA genes, 22 tRNA genes, and two noncoding regions (the control region and the putative origin of light-strand replication). Twelve PCGs start with ATG, while COI uses GTG as the start codon. The secondary structure of tRNA-Ser (AGN) lacks the dihydrouracil (DHU) arm. The control region is 943 bp in length, with a termination-associated sequence, six conserved sequence blocks (CSB-1, CSB-2, CSB-3, CSB-D, CSB-E, CSB-F), and a repetitive sequence. Phylogenetic analysis was performed with maximum likelihood and Bayesian methods based on the concatenated nucleotide sequences of the 13 PCGs and on the complete sequence without the control region. The results revealed that R. oxycephalus is most closely related to R. percnurus and most distantly related to R. kumgangensis. The genus Rhynchocypris is revealed as a polyphyletic group, and R. kumgangensis is distantly related to the other Rhynchocypris species. In addition, the COI and ND2 genes are considered the most suitable DNA barcoding genes in the genus Rhynchocypris. This work provides additional molecular information for studying the conservation genetics and evolutionary relationships of R. oxycephalus.

Keywords:  DNA barcoding; Rhynchocypris oxycephalus; conservation genetics; mitochondrial genome; phylogenetic analysis

Year:  2019        PMID: 31346443      PMCID: PMC6635945          DOI: 10.1002/ece3.5369

Source DB:  PubMed          Journal:  Ecol Evol        ISSN: 2045-7758            Impact factor:   2.912


INTRODUCTION

Rhynchocypris oxycephalus (Cyprinidae, Cypriniformes, Osteichthyes) is a small cold‐water species (Figure 1) that generally inhabits the upper reaches of streams and mountain streams with sand or stone substrates, at higher altitudes, with low water temperatures and high dissolved oxygen (Liang, Sui, Chen, Jia, & He, 2014; Zhang et al., 2011). It is an omnivorous fish that usually feeds on invertebrates, aquatic insect larvae, or plant debris. R. oxycephalus often occurs in large populations, acting as the dominant species and playing a crucial role in maintaining the balance of stream ecosystems (Park, Im, Ryu, Nam, & Dong, 2010). Because of its poor dispersal ability, R. oxycephalus is an ideal subject for the study of freshwater fish biogeography.
Figure 1

Photograph of Rhynchocypris oxycephalus

The phylogenetic relationships of the genus Rhynchocypris are complicated and have long been one of the controversial issues in the classification of the subfamily Leuciscinae. Formerly, the genus Rhynchocypris was considered a synonym of the genus Phoxinus (Nelson & Joseph, 1976). Based on isozymes, Ito, Sakai, Shedko, and Jeon (2002) found that Phoxinus and Rhynchocypris were two natural taxa with a close relationship. Based on the 16S rRNA and Cytb genes of the mitochondrial genome, Sasaki et al. (2007) found that Phoxinus and Rhynchocypris were more distantly related and that Rhynchocypris was the sister group of the genera Tribolodon and Pseudaspius. As these studies show, the phylogenetic position of Rhynchocypris remains controversial, and further research is needed.
The typical vertebrate mitochondrial genome is circular, ranges in size from ~15 to 18 kb, and generally contains 37 genes (13 protein‐coding genes, 22 tRNAs, and two rRNAs) and two noncoding regions (the control region and the putative origin of light‐strand replication; Sasaki et al., 2007). Because of its maternal inheritance, high mutation rate, and small molecular weight, mitochondrial DNA has been widely used as a molecular marker in phylogenetic analysis. In addition, mitochondrial gene fragments evolve at different rates, so different fragments can be applied to studies at different taxonomic levels. For example, the RNA genes evolve more slowly and are relatively conserved, which makes them suitable for studies of higher taxonomic ranks. The ND genes, COI, and other genes evolve faster than the RNA genes and are suitable for phylogenetic analysis among species or genera. Owing to the limitations of morphological classification methods, more and more molecular methods have been applied to fish species identification in recent years. 
DNA barcoding is the most widely used among them (Hogg & Hebert, 2004). It is a technique for rapidly identifying species by analyzing the DNA sequences of standard target genes. It can not only identify known species but also reveal new and cryptic species that cannot be distinguished by traditional taxonomic methods. Compared with traditional species identification methods, DNA barcoding offers high accuracy and high efficiency and is not affected by the environment or developmental stage of the specimen or by the experience of the identifier (Hebert, Ratnasingham, & Dewaard, 2003). Among mitochondrial genes, COI is commonly used for species identification in birds (Yoo et al., 2013), insects (Hajibabaei, Janzen, Burns, Hallwachs, & Hebert, 2006), and fishes (Ward, Zemlak, Innes, Last, & Hebert, 2005), with good results. However, COI is not a suitable DNA barcode for all animal species. For example, Li, Liu, Li, Du, and Zhuang (2015) analyzed Clupeiformes with the COI gene and found that although all species could be distinguished, the efficiency was only moderate. In such cases, additional mitochondrial genes should be evaluated as animal DNA barcodes. For example, Miya and Nishida (2000a, 2000b), Zardoya and Meyer (1996), and Chen, Chi, Mu, Liu, and Zhou (2008) considered the COI, COIII, ND2, ND4, ND5, and Cytb genes to be the best molecular markers for phylogenetic analysis of vertebrates and teleosts. These genes therefore have the potential to be good DNA barcodes. Compared with a single mitochondrial gene fragment, the complete mitochondrial genome carries the full mitochondrial genetic information and can reveal mitochondrial molecular evolution more comprehensively. In species classification, phylogenetic relationships based on the complete mitochondrial sequence can serve as a reference. 
In this study, we designed primers to amplify the full mitochondrial genome sequence, which can also serve as a reference for amplifying the complete mitochondrial genomes of other cyprinid fishes, and determined the complete mitochondrial genome of R. oxycephalus. We then described its genome organization, gene arrangement, and other characteristics. On this basis, we aligned the sequence with those of other species to explore its phylogenetic relationships within Rhynchocypris and Leuciscus. Moreover, we aimed to identify an effective DNA barcode for Rhynchocypris species to facilitate their identification.

MATERIALS AND METHODS

Sample collection and DNA extraction

Individuals of R. oxycephalus were collected from Qingyang County, Anhui province, China, in May 2018. The muscle was preserved in 95% ethanol and stored at −20°C until DNA extraction was performed. Genomic DNA was extracted from the muscle using the Column mtDNA out kit (Sangon, Shanghai, China) and stored at −20°C until needed for PCR.

PCR amplification and sequencing

PCR primers were designed with Primer Premier 5.0 software (Lalitha, 2000) on the basis of universal primers for fish mtDNA (Simon et al., 1994). In addition, we used 16 sets of specific primers to amplify overlapping segments of the complete mitochondrial genome of R. oxycephalus. The specific primers were designed from alignments of relatively conserved regions of Phoxinus oxycephalus (GenBank accession nos.: NC_027273; Sui, Liang, & He, 2016 and NC_018818; Imoto et al., 2013), and their sequences are shown in Table 1. PCR amplification was performed in a 20 μl reaction volume containing about 10 μl Premix Taq, 1 μl template, 7 μl ddH2O, and 1 μl of each primer. The amplification conditions were an initial denaturation for 5 min at 95°C, followed by 35 cycles of denaturation for 30 s at 94°C, annealing for 30 s at 50–55°C, and extension for 1 min at 72°C, with a final extension at 72°C for 10 min. PCR products were separated by 1.0% agarose gel electrophoresis. All PCR fragments were sequenced after separation and purification at Map Biotech Inc (Shanghai, China).
Table 1

The 16 primer combinations for amplifying the complete mitochondrial DNA of Rhynchocypris oxycephalus

Primer name | Primer sequence (5'–3')
Rhynchocypris‐1F | GACGAGGAGCGGGCATCAGG
Rhynchocypris‐1R | CGGGGTATCAAACTAAAGGTC
Rhynchocypris‐2F | CCAACACCACAAACTAAACCAT
Rhynchocypris‐2R | TCTAGCCATTCATACAGGTCTCT
Rhynchocypris‐3F | CAACGAACCAAGTTACCCAAG
Rhynchocypris‐3R | GTGCCCAAAAATAGTACGACTG
Rhynchocypris‐4newF | AACCTGTTCGCCCCTCTACCT
Rhynchocypris‐4newR | GGCAAGGAAGGCTGCGGATGT
Rhynchocypris‐5F | CCTCTTAACGGCCTTTGGACT
Rhynchocypris‐5R | TTCCAAACCCTCCAATAAGAA
Rhynchocypris‐6F | GTGACAGCCGTCCTTCTCCTC
Rhynchocypris‐6R | GTAAGTTTGGTTGAGACTATCGC
Rhynchocypris‐7NEWF | ACCCCTGTATGTCTTGAGCTC
Rhynchocypris‐7NEWR | ATTAGTTGATTGGTAAATCGGTTC
Rhynchocypris‐8F | ATAARACTGACTCCTGAACCTGA
Rhynchocypris‐8R | GCCTGGAGAGCGGTAAAATAA
Rhynchocypris‐9F | AGGAGTTATTACGCTGGACCC
Rhynchocypris‐9R | GTTRAGGTTTTGTAGGCGGTC
Rhynchocypris‐10NEWF | GGTTAGCATTTCATCGCACACA
Rhynchocypris‐10NEWR | TGGGTTCGTTCATAGGCTGT
Rhynchocypris‐11F | TGCCTACGACAAACAGACCTTA
Rhynchocypris‐11R | GTGTAATCATGGCTACCAAGAA
Rhynchocypris‐12F | GCGTTCGACACAAACATTAGCT
Rhynchocypris‐12R | AATGGATTGTCCTCGCTGAT
Rhynchocypris‐13F | TRGCACTGACAGGCACCCCAT
Rhynchocypris‐13R | GTTYTAATTGTGGGTTTAATTGCT
Rhynchocypris‐14F | AAAGRACGAGGGATAAGAAGGA
Rhynchocypris‐14R | CCCTGTCTCGTGTAGAAAGAGCA
Rhynchocypris‐15F | AGACCTCCTTGGCTTTGTAGTA
Rhynchocypris‐15R | TGTTGGGTAACGAGGAGTATG
Rhynchocypris‐16F | ATGATAGAACCAGGGACACAAT
Rhynchocypris‐16NEWR | TATTGCTCCTCCTAACCACCC

Sequencing assembling and annotation

The complete mitochondrial genome sequence was assembled and annotated with the software Geneious (Drummond et al., 2010). Locations of PCGs and rRNA genes were annotated by comparison with the genes of Phoxinus oxycephalus (NC_027273, NC_018818). PCG boundaries were identified with ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). The tRNA genes were identified using the online program tRNAscan‐SE 2.0 (http://lowelab.ucsc.edu/tRNAscan-SE/; Lowe & Chan, 2016), and the secondary structures of the mitochondrial tRNAs were predicted with the software RNAstructure 5.6 (Reuter & Mathews, 2010). Putative tRNAs that were not found by these two tools were identified on the basis of sequence similarity to tRNAs of previously published Cyprinidae mitochondrial genomes. The secondary structure of the putative origin of light‐strand replication was analyzed with RNAstructure 5.6 (Reuter & Mathews, 2010). Nucleotide composition and codon usage were analyzed with Mega 5.0 (Tamura et al., 2011). Composition skew was calculated with the formulas AT‐skew = [A − T]/[A + T] and GC‐skew = [G − C]/[G + C] (Nicole & Thomas, 1995). The tandem repeats of the putative control region were analyzed with the Tandem Repeats Finder program (http://tandem.bu.edu/trf/trf.advanced.submit.html; Benson, 1999). The gene map of the R. oxycephalus mitochondrial genome was drawn with the online software Ogdraw (https://chlorobox.mpimp-golm.mpg.de/OGDraw.html; Lohse, Drechsel, Kahlau, & Bock, 2013).
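The two skew formulas can be illustrated with a minimal Python sketch. The function name and the toy sequence are assumptions made for illustration only; the study itself reports using Mega 5.0 for composition analysis.

```python
from collections import Counter


def composition_skews(seq: str) -> tuple[float, float]:
    """Return (AT-skew, GC-skew) for a nucleotide sequence.

    AT-skew = (A - T) / (A + T); GC-skew = (G - C) / (G + C).
    Characters other than A, C, G, T are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    if a + t == 0 or g + c == 0:
        raise ValueError("sequence needs at least one A/T and one G/C base")
    return (a - t) / (a + t), (g - c) / (g + c)


# Toy sequence (hypothetical, not from the study): A=3, T=2, G=1, C=4
at_skew, gc_skew = composition_skews("AAATTGCCCC")
print(f"AT-skew = {at_skew:.3f}, GC-skew = {gc_skew:.3f}")
```

With the sign convention written in the formulas, a sequence that is poorer in G than in C gives a negative GC‐skew, and one richer in A than in T gives a positive AT‐skew.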

Sequence alignment and phylogenetic analysis

Sequence alignment and phylogenetic analyses were performed with the complete mitochondrial genomes of 8 Rhynchocypris species, including sequences downloaded from GenBank. Multiple alignments of the mitochondrial gene sequences were performed with Clustal X 1.83 (Jeanmougin, Thompson, Gouy, Higgins, & Gibson, 1998) with the default settings. The length of the consensus sequences, number of variable sites, Kimura 2‐parameter (K2P) distances, and Ts/Tv ratios were calculated with Mega 5.0 (Tamura et al., 2011). The online program Gblocks 0.91b (http://www.phylogeny.fr/one_task.cgi?task_type=gblocks) with default settings was used to find the conserved regions of the sequences (Castresana, 2000). Before the phylogenetic trees were constructed, substitution saturation was tested with the DAMBE software using GTR distance (Xia, 2013). Modeltest 3.7 (Posada & Crandall, 1998), with likelihood ratio tests and the Akaike information criterion (AIC; Bozdogan, 1987), was used to determine the best‐fitting model. Maximum likelihood (ML) analysis of the 13 PCGs of the 8 Rhynchocypris species was performed with Mega 5.0 (Tamura et al., 2011), with Acrossocheilus fasciatus as the outgroup. Support values for the ML tree were evaluated with a bootstrap test of 1,000 replicates. In this analysis, the “GTR + G” model was considered the best‐fit model. Further, to explore the evolutionary relationships within Leuciscus, the complete mitochondrial genomes of seven other Leuciscus fish species were downloaded from GenBank. Maximum likelihood analysis of the complete mitochondrial genomes of the Leuciscus fishes was performed using Mega 5.0 (Tamura et al., 2011), with Acrossocheilus fasciatus as the outgroup. Bayesian inference (BI) was carried out using MrBayes v.3.2.6 (Huelsenbeck & Ronquist, 2005). Bayesian posterior probabilities were estimated using the Markov chain Monte Carlo (MCMC) sampling approach. 
The Bayesian analysis started from a random tree, ran four Markov chains simultaneously, sampled once every 100 generations, discarded the first 25% of the samples as burn‐in, and built a consensus tree from the remaining samples. The control region was excluded from both analyses because of its high variability. For the ML and BI analyses, the optimal model GTR + I + G (nst = 6; rates = gamma) was selected by AIC in Modeltest 3.7 (Posada & Crandall, 1998).
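The sampling and burn‐in scheme described above can be sketched in a few lines of Python. This only illustrates the bookkeeping (sample every 100 generations, discard the first 25%). It is not MrBayes itself, and the run length of 10,000 generations is a hypothetical value, since the text does not state the number of generations.

```python
def discard_burnin(samples, burnin_fraction=0.25):
    """Drop the first `burnin_fraction` of the saved samples and return the rest."""
    n_discard = int(len(samples) * burnin_fraction)
    return samples[n_discard:]


generations = 10_000   # hypothetical run length, not given in the text
sample_every = 100     # sampling frequency stated in the text

# Generation numbers at which a tree would be saved (generation 0 included).
sampled = list(range(0, generations + 1, sample_every))
kept = discard_burnin(sampled)

print(len(sampled), "samples saved;", len(kept), "kept after burn-in")
print("first retained generation:", kept[0])
```

Only the retained samples would then contribute to the consensus tree and the posterior probabilities.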

The analysis of DNA barcoding

Six PCGs (COI, COIII, ND2, ND4, ND5, and Cytb) were selected as candidate DNA barcodes for Rhynchocypris species in order to find the most suitable one. We downloaded 15 other complete mitochondrial genomes of Rhynchocypris species from GenBank, including two sequences of Phoxinus oxycephalus (NC_027273; Sui et al., 2016, KP641342; Sui et al., 2016), two sequences of R. percnurus (AP009061; Imoto et al., 2013, NC_015360; Imoto et al., 2013), four sequences of R. lagowskii (KJ641843; Sun, Wang, & Wei, 2016, AP009147; Imoto et al., 2013, NC_015354; Imoto et al., 2013, KR091310; Unpublished), one sequence of R. percnurus mantschuricus (NC_008684; Saitoh et al., 2006), one sequence of R. p. sachalinensis (NC_015362; Imoto et al., 2013), two sequences of R. kumgangensis (NC_019614; Yun, Yu, Kim, & Kwak, 2012, AP011363; Unpublished), one sequence of R. semotilus (NC_029341; Imoto et al., 2013), and two sequences of R. oxycephalus jouyi (NC_018818; Imoto et al., 2013, AP011269; Miya et al., 2015). The variation rate and the K2P interspecific and intraspecific distances were calculated with Mega 5.0 (Tamura et al., 2011). Based on the K2P interspecific and intraspecific distances, a Wilcoxon signed‐rank test was conducted in SPSS 19.0 (Field, 2013) to compare the differences among the 6 PCGs.
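For readers unfamiliar with the K2P distance used here, the following minimal Python sketch computes it for one pair of aligned sequences from the proportions of transitions (P) and transversions (Q), using d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)]. The function name and the toy sequences are illustrative assumptions; the study reports using Mega 5.0 for these calculations.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences.

    Sites with gaps or ambiguity codes in either sequence are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment (hypothetical): one transition (A/G) and one transversion (A/C) in 10 sites
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")
print(f"K2P distance = {d:.4f}")
```

Intraspecific and interspecific distances would be obtained by applying such a pairwise function within and between species, respectively.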

RESULTS

Genome annotation and base composition

We obtained the mitochondrial genome sequence of R. oxycephalus and deposited it in NCBI under GenBank accession no. MH885043. The mitogenome of R. oxycephalus is a circular DNA molecule 16,609 bp in length. As shown in Figure 2, its organization was similar to that of the typical vertebrate mitochondrial genome: it contained 13 PCGs, 22 transfer RNA genes, 2 ribosomal RNA genes, and 2 noncoding regions. The basic information and genomic structure of the gene sequences are shown in Table 2. The gene arrangement of the R. oxycephalus mitogenome was consistent with that of most Cyprinidae fishes (Zhang, Yue, Jiang, & Song, 2009). The light strand (L strand) encoded only the ND6 gene and 8 tRNA genes (tRNA‐Gln, tRNA‐Ala, tRNA‐Asn, tRNA‐Cys, tRNA‐Tyr, tRNA‐Ser, tRNA‐Glu, and tRNA‐Pro; Table 2). All other mitochondrial genes were encoded on the heavy strand (H strand).
Figure 2

Gene map of the Rhynchocypris oxycephalus mitochondrial genome

Table 2

Characteristics of the mitochondrial genome of Rhynchocypris oxycephalus

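Feature | Length/bp | Position | Start codon | Stop codon | Anticodon | Intergenic nucleotide | Number of amino acid | Strand
tRNA‐Phe | 69 | 1–69 | | | GAA | 0 | | H
12S rRNA | 957 | 70–1026 | | | | 0 | | H
tRNA‐Val | 72 | 1,027–1098 | | | TAC | 0 | | H
16S rRNA | 1690 | 1,099–2788 | | | | 0 | | H
tRNA‐Leu | 76 | 2,789–2864 | | | TAA | 1 | | H
ND1 | 975 | 2,866–3840 | ATG | TAA | | 4 | 324 | H
tRNA‐Ile | 72 | 3,845–3916 | | | GAT | −2 | | H
tRNA‐Gln | 71 | 3,915–3985 | | | TTG | 1 | | L
tRNA‐Met | 69 | 3,987–4055 | | | CAT | 0 | | H
ND2 | 1,045 | 4,056–5100 | ATG | T | | 0 | 348 | H
tRNA‐Trp | 71 | 5,101–5171 | | | TCA | 1 | | H
tRNA‐Ala | 69 | 5,173–5241 | | | TGC | 1 | | L
tRNA‐Asn | 73 | 5,243–5315 | | | GTT | 0 | | L
OL | 36 | 5,316–5351 | | | | −3 | | L
tRNA‐Cys | 68 | 5,349–5416 | | | GCA | 1 | | L
tRNA‐Tyr | 71 | 5,418–5488 | | | GTA | 1 | | L
COI | 1551 | 5,490–7040 | GTG | TAA | | 0 | 516 | H
tRNA‐Ser | 71 | 7,041–7111 | | | TGA | 3 | | L
tRNA‐Asp | 74 | 7,115–7188 | | | GTC | 13 | | H
COII | 691 | 7,202–7892 | ATG | T | | 0 | 230 | H
tRNA‐Lys | 76 | 7,893–7968 | | | TTT | 1 | | H
ATP8 | 165 | 7,970–8134 | ATG | TAG | | −7 | 54 | H
ATP6 | 684 | 8,128–8811 | ATG | TAA | | −1 | 227 | H
COIII | 784 | 8,811–9594 | ATG | T | | 0 | 261 | H
tRNA‐Gly | 71 | 9,595–9665 | | | TCC | 0 | | H
ND3 | 349 | 9,666–10014 | ATG | T | | 0 | 116 | H
tRNA‐Arg | 69 | 10,015–10083 | | | TCG | 0 | | H
ND4L | 297 | 10,084–10380 | ATG | TAA | | 0 | 98 | H
ND4 | 1,382 | 10,374–11755 | ATG | TA | | −7 | 460 | H
tRNA‐His | 69 | 11,756–11824 | | | GTG | 0 | | H
tRNA‐Ser | 68 | 11,825–11892 | | | GTC | 0 | | H
tRNA‐Leu | 73 | 11,894–11966 | | | TAG | 1 | | H
ND5 | 1836 | 11,967–13802 | ATG | TAA | | 0 | 611 | H
ND6 | 522 | 13,799–14320 | ATG | TAA | | −4 | 173 | L
tRNA‐Glu | 68 | 14,321–14388 | | | TTC | 0 | | L
Cytb | 1,141 | 14,391–15531 | ATG | T | | 2 | 380 | H
tRNA‐Thr | 72 | 15,532–15603 | | | TGT | 0 | | H
tRNA‐Pro | 71 | 15,603–15673 | | | TGG | −1 | | L
D‐loop | 936 | 15,674–16609 | | | | 0 | | H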
The overall base composition of the R. oxycephalus mitochondrial genome was A: 28.7%, T: 27.3%, C: 26.2%, and G: 17.8%, and it exhibited positive AT‐skew (0.030) and GC‐skew (0.196), which was consistent with G being the least frequent base in typical fish mitochondrial genomes (Perna & Kocher, 1995). The overall A + T content was 56.0%; such an A + T‐rich pattern is a typical sequence feature of vertebrate mitochondrial genomes (Mayfield & Mckenna, 1978). The R. oxycephalus mitochondrial genome contained 25 overlapping nucleotides, located between 7 pairs of neighboring genes and ranging in length from 1 to 7 bp; the two longest overlaps (7 bp each) were located between ND4L and ND4 and between ATP8 and ATP6. A total of 30 intergenic nucleotides were dispersed over 12 locations and ranged in size from 1 to 13 bp; the longest intergenic spacer (13 bp) was located between tRNA‐Asp and COII.
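Overlaps and intergenic spacers follow directly from the coordinates in Table 2. As a minimal sketch, the Python snippet below takes six consecutive features with their positions copied from Table 2 and computes the gap at each junction; the function name is an illustrative assumption.

```python
# (feature, start, end) for six consecutive features, positions copied from Table 2
features = [
    ("tRNA-Asp", 7115, 7188),
    ("COII", 7202, 7892),
    ("tRNA-Lys", 7893, 7968),
    ("ATP8", 7970, 8134),
    ("ATP6", 8128, 8811),
    ("COIII", 8811, 9594),
]


def junction_gaps(feats):
    """Return (upstream, downstream, gap) for each adjacent pair of features.

    gap > 0: intergenic nucleotides; gap < 0: overlapping nucleotides.
    """
    return [(up[0], down[0], down[1] - up[2] - 1) for up, down in zip(feats, feats[1:])]


gaps = junction_gaps(features)
for upstream, downstream, gap in gaps:
    kind = "spacer" if gap > 0 else "overlap" if gap < 0 else "adjacent"
    print(f"{upstream} / {downstream}: {gap:+d} ({kind})")
```

For these features the calculation reproduces the 13 bp spacer between tRNA‐Asp and COII and the 7 bp overlap between ATP8 and ATP6 described in the text.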

Protein‐coding genes

Of the 13 PCGs of R. oxycephalus, 12 used ATG as the initiation codon; the exception was COI, which used GTG. The COI genes of previously reported fishes also use GTG as the initiation codon, so this feature seems to be prevalent among nontetrapod vertebrates (Saitoh et al., 2000). Stop codons, however, varied among the 13 PCGs. Seven PCGs ended with complete stop codons, either TAA (ND1, COI, ATP6, ND4L, ND5, and ND6) or TAG (ATP8); the remaining six genes ended with incomplete stop codons, either TA (ND4) or T (ND2, COII, ND3, COIII, and Cytb), which are presumably completed to TAA after transcription (Anderson et al., 1981). The codon usage and relative synonymous codon usage (RSCU) of the R. oxycephalus mitochondrial genome are given in Table 3. Codons with A or T in the third position were abundant, whereas codons with relatively high G and C content tended to be avoided. The codon distribution in R. oxycephalus is given in Figure 3. Codons per thousand codons (CDspT) showed a preference for leucine and alanine.
Table 3

Codon usage in Rhynchocypris oxycephalus mitochondrial protein‐coding genes

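Codon | Count | RSCU | Codon | Count | RSCU | Codon | Count | RSCU | Codon | Count | RSCU
UUU(F) | 113 | 1 | UCU(S) | 44 | 1.07 | UAU(Y) | 76 | 1.35 | UGU(C) | 14 | 1
UUC(F) | 112 | 1 | UCC(S) | 58 | 1.41 | UAC(Y) | 37 | 0.65 | UGC(C) | 14 | 1
UUA(L) | 137 | 1.33 | UCA(S) | 73 | 1.78 | UAA(*) | 6 | 3.43 | UGA(W) | 95 | 1.58
UUG(L) | 31 | 0.3 | UCG(S) | 13 | 0.32 | UAG(*) | 1 | 0.57 | UGG(W) | 25 | 0.42
CUU(L) | 142 | 1.37 | CCU(P) | 43 | 0.8 | CAU(H) | 38 | 0.75 | CGU(R) | 15 | 0.79
CUC(L) | 88 | 0.85 | CCC(P) | 70 | 1.31 | CAC(H) | 64 | 1.25 | CGC(R) | 16 | 0.84
CUA(L) | 168 | 1.63 | CCA(P) | 77 | 1.44 | CAA(Q) | 84 | 1.66 | CGA(R) | 37 | 1.95
CUG(L) | 54 | 0.52 | CCG(P) | 24 | 0.45 | CAG(Q) | 17 | 0.34 | CGG(R) | 8 | 0.42
AUU(I) | 188 | 1.41 | ACU(T) | 41 | 0.55 | AAU(N) | 62 | 1.11 | AGU(S) | 14 | 0.34
AUC(I) | 78 | 0.59 | ACC(T) | 111 | 1.5 | AAC(N) | 50 | 0.89 | AGC(S) | 44 | 1.07
AUA(M) | 124 | 1.36 | ACA(T) | 121 | 1.64 | AAA(K) | 59 | 1.51 | AGA(*) | 0 | 0
AUG(M) | 59 | 0.64 | ACG(T) | 23 | 0.31 | AAG(K) | 19 | 0.49 | AGG(*) | 0 | 0
GUU(V) | 79 | 1.25 | GCU(A) | 62 | 0.72 | GAU(D) | 27 | 0.7 | GGU(G) | 34 | 0.55
GUC(V) | 38 | 0.6 | GCC(A) | 145 | 1.69 | GAC(D) | 50 | 1.3 | GGC(G) | 71 | 1.15
GUA(V) | 96 | 1.52 | GCA(A) | 109 | 1.27 | GAA(E) | 71 | 1.45 | GGA(G) | 84 | 1.36
GUG(V) | 39 | 0.62 | GCG(A) | 28 | 0.33 | GAG(E) | 27 | 0.55 | GGG(G) | 58 | 0.94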

*Stop codons. The letters in brackets are the abbreviations of the amino acids; preferred codons are in bold.
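RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. As a minimal Python sketch, the snippet below recomputes the RSCU values for the six leucine codons from the counts in Table 3; the function name is an illustrative assumption.

```python
# Leucine codon counts copied from Table 3
leucine_counts = {
    "UUA": 137, "UUG": 31,
    "CUU": 142, "CUC": 88, "CUA": 168, "CUG": 54,
}


def rscu(codon_counts):
    """Relative synonymous codon usage for one family of synonymous codons.

    RSCU = observed count / (total count / number of synonymous codons).
    """
    total = sum(codon_counts.values())
    n_codons = len(codon_counts)
    return {codon: count * n_codons / total for codon, count in codon_counts.items()}


leucine_rscu = rscu(leucine_counts)
for codon, value in leucine_rscu.items():
    print(f"{codon}: {value:.2f}")
```

A value above 1 means the codon is used more often than expected under equal usage; here CUA is the most preferred leucine codon.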

Figure 3

Codon distribution in Rhynchocypris oxycephalus. CDspT‐codons per thousand codons

Ribosomal and transfer RNA genes

The 12S and 16S rRNA genes of the R. oxycephalus mitochondrial genome were 957 and 1693 bp in length, respectively. As in other vertebrates, they were located between the tRNA‐Phe and tRNA‐Leu (UUR) genes and separated by the tRNA‐Val gene. The base composition of the two rRNA gene sequences was A: 28.6%, T: 26.6%, C: 21.1%, and G: 23.7%. The A + T and G + C contents of the two rRNA genes were 53.4% and 46.6%, respectively. The secondary structures of animal tRNAs are highly similar: a typical cloverleaf stem‐loop structure with four arms and four loops, one of which is a variable loop. According to their functions, the arms and loops are named the amino acid acceptor arm, the dihydrouracil (DHU) arm and loop, the anticodon arm and loop, the TψC arm and loop, and the variable loop. The mitochondrial genome of R. oxycephalus contained 22 tRNA genes, 14 on the heavy strand (H strand) and 8 on the light strand (L strand), with gene lengths of 68–76 bp. The average base composition of the 22 tRNAs was A: 32.9%, G: 22.3%, T: 20.5%, and C: 24.3%. There were two genes each for tRNA‐Ser and tRNA‐Leu and one gene for each of the other tRNAs. Among the 22 tRNA genes, tRNA‐Ser (AGY) lacked a DHU arm (Figure 4), and the rest had typical cloverleaf structures, accompanied by U–U, A–A, and G–U mismatches. As in other Rhynchocypris species, most of the mismatched nucleotides were G–U pairs, which can form weak bonds and noncanonical pairs in tRNA secondary structures (Gutell, Lee, & Cannone, 2002).
Figure 4

The secondary structures of the tRNA‐Ser(AGY) genes in Rhynchocypris oxycephalus

Noncoding regions

Like other vertebrates, R. oxycephalus had two noncoding regions in its mitochondrial genome: the control region (D‐loop) and the putative origin of light‐strand replication (OL). The control region was 943 bp in length and located between the tRNA‐Pro and tRNA‐Phe genes. It is also called the A + T‐rich region; its A + T content accounted for 65% of all bases, much higher than the G + C content. Similar results have been observed in other Cyprinidae species (Zhang et al., 2009). The control region consisted of a termination‐associated sequence (TAS), a central conserved domain (CCD), and conserved sequence blocks (CSB). The TAS contained an obvious hairpin motif (TACAT and ATGTA; Guo, Liu, & Liu, 2003). Liu (2002) identified three conserved sequence blocks (CSB‐D, CSB‐E, and CSB‐F) in the CCD. In addition, previous studies of mammalian conserved sequence regions found that there are generally three conserved sequences in the CSB domain, named CSB‐1, CSB‐2, and CSB‐3, and speculated that this region is involved in generating the heavy‐strand RNA primer (Walberg & Clayton, 1981). One repetitive sequence (AT) was found with the Tandem Repeats Finder program; this repeat has also been found in other Cyprinidae species (Liu, 2002). By comparison with the nucleotide sequences of other Cyprinidae fishes, all of these sequence features were identified (ETAS: 5′‐TACATATATATGTATTATCACCATTCATTTATCTTAACCTA‐3′; CSB‐F: 5′‐ATGTAGTAAGAGCCCACC‐3′; CSB‐E: 5′‐CCAGGGACACAATATGTGGGGGT‐3′; CSB‐D: 5′‐TATTCCTTGCATCTGGTTCCTATTTCA‐3′; CSB‐1: 5′‐TTCATCATTAAAAGACATA‐3′; CSB‐2: 5′‐CAAACCCCCCTACCCCCC‐3′; CSB‐3: 5′‐TGTCAAACCCCGAAACCAA‐3′). All sequence features of the R. oxycephalus control region are shown in Figure 5.
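The hairpin motif in the TAS can be checked with a short Python sketch: TACAT and ATGTA are reverse complements of each other, and both occur in the ETAS sequence quoted above. The helper function is an illustrative assumption and is not the authors' annotation procedure.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.translate(complement)[::-1]


# ETAS sequence as quoted in the text (5'-3')
ETAS = "TACATATATATGTATTATCACCATTCATTTATCTTAACCTA"

motif = "TACAT"
partner = reverse_complement(motif)

# str.find returns the 0-based index of the first occurrence, or -1 if absent.
motif_pos = ETAS.find(motif)
partner_pos = ETAS.find(partner)
print(f"{motif} at index {motif_pos}; {partner} at index {partner_pos}")
```

The same `find` call could be used to locate the quoted CSB sequences within a control‐region sequence.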
Figure 5

Schematic map characterizing of the control region of Rhynchocypris oxycephalus. ETAS‐extended termination‐associated sequence, CSB‐conserved sequence blocks

The putative origin of light‐strand replication (OL) was located in a cluster of five tRNA genes (the WANCY region), between tRNA‐Asn and tRNA‐Cys, as in other vertebrates. It was 36 bp in length and could fold into a stable stem‐loop secondary structure with a 12 bp loop and a 24 bp stem. The stem‐loop structure is generally characteristic of the origin of light‐strand replication and is closely related to the replication of mitochondrial DNA (Kawaguchi, Miya, & Nishida, 2001). The secondary structure of the putative origin of light‐strand replication in R. oxycephalus is shown in Figure 6.
Figure 6

The secondary structures of the putative origin of light‐strand replication gene in Rhynchocypris oxycephalus

Sequence alignment

To compare the differences among Rhynchocypris species, the mitogenome sequences of 7 other Rhynchocypris species were downloaded from GenBank and included in this study (Table 4).
Table 4

Mitochondrial genome of the Rhynchocypris species used in this study

Species | Length (bp) | A + T % | AT‐skew | GC‐skew | Accession number | Reference
R. oxycephalus | 16,609 | 56.0 | 0.030 | 0.196 | MH885043 | This study
R. percnurus | 16,608 | 55.8 | 0.034 | 0.204 | KT359599 | Unpublished
R. lagowskii | 16,603 | 55.7 | 0.039 | 0.205 | KF734881 | Unpublished
R. percnurus mantschuricus | 16,602 | 57.9 | 0.030 | 0.200 | AP009061 | Saitoh et al. (2006)
R. percnurus sachalinensis | 16,599 | 58.0 | 0.033 | 0.214 | AP009150 | Imoto et al. (2013)
R. kumgangensis | 16,604 | 54.5 | 0.038 | 0.221 | JQ675733 | Yun, Yu, Kim, and Kwak (2012)
R. semotilus | 16,605 | 55.7 | 0.025 | 0.193 | KT748874 | Yu, Kim, and Kim (2017)
R. oxycephalus jouyi | 16,607 | 55.8 | 0.025 | 0.191 | AB626852 | Imoto et al. (2013)
The complete mitochondrial genomes, the 13 PCGs, the tRNA genes and their combined sequence, and the rRNA genes and their combined sequence were all aligned with Clustal X 1.83 (Jeanmougin et al., 1998), and the results are shown in Table 5.
Table 5

Some feature of mitochondrial genomes and different regions of 8 Rhynchocypris species

Region | Length of consensus sequence | Amount of variable sites | Kimura 2‐Parameter distance | Ts/Tv ratios | T (%) | C (%) | A (%) | G (%)
Mitochondrion genome | 15,684 | 3,417 (21.8%) | 0.097 | 4.72 | 26.9 | 26.6 | 28.8 | 17.6
ND1 | 975 | 297 (30.5%) | 0.141 | 4.18 | 29.8 | 27.7 | 25.7 | 16.8
ND2 | 1,047 | 360 (34.4%) | 0.163 | 4.13 | 26.0 | 31.1 | 27.6 | 15.4
COI | 1,551 | 314 (20.2%) | 0.088 | 5.91 | 31.1 | 24.8 | 25.8 | 18.3
COII | 691 | 139 (20.1%) | 0.086 | 7.14 | 28.5 | 25.2 | 29.1 | 17.2
ATP8 | 165 | 40 (24.2%) | 0.107 | 7.34 | 23.9 | 29.0 | 34.8 | 12.3
ATP6 | 684 | 181 (26.5%) | 0.117 | 4.09 | 31.2 | 26.2 | 28.1 | 14.6
COIII | 785 | 150 (19.1%) | 0.090 | 7.01 | 29.7 | 26.8 | 25.5 | 18.0
ND3 | 351 | 93 (26.5%) | 0.116 | 6.32 | 29.3 | 28.4 | 26.0 | 16.3
ND4L | 297 | 60 (20.2%) | 0.086 | 5.65 | 28.0 | 29.9 | 26.4 | 15.7
ND4 | 1,383 | 395 (28.6%) | 0.139 | 4.69 | 29.1 | 27.3 | 28.0 | 15.7
ND5 | 1,839 | 555 (30.2%) | 0.145 | 4.51 | 28.8 | 27.8 | 28.5 | 14.9
ND6 | 522 | 166 (31.8%) | 0.147 | 6.37 | 38.3 | 14.4 | 16.0 | 31.2
Cytb | 1,141 | 275 (24.1%) | 0.112 | 3.93 | 30.2 | 27.0 | 26.8 | 16.0
12S rRNA | 957 | 71 (7.4%) | 0.029 | 4.94 | 19.5 | 26.3 | 30.2 | 24.0
16S rRNA | 1,693 | 177 (10.5%) | 0.040 | 2.78 | 21.1 | 23.1 | 34.4 | 21.3
Combined sequences of tRNA genes | 2,650 | 249 (9.4%) | 0.036 | 3.23 | 20.5 | 24.3 | 32.9 | 22.3
Combined sequences of rRNA genes | 1,566 | 124 (8.0%) | 0.028 | 7.12 | 26.6 | 21.1 | 28.6 | 23.7
According to Brown, George, and Wilson (1979) and Knight and Mindell (1993), when the transition/transversion (Ts/Tv) ratio of a gene sequence is lower than 2.0, the substitutions are generally considered to have reached saturation and are likely to be affected by evolutionary noise, so special weighting is required to ensure that phylogenetic relationships are reconstructed from reliable information. All of the Ts/Tv ratios were higher than 2.0, which indicated that transitions and transversions were not saturated and that the sequences were suitable for phylogenetic analysis. In addition, the G content of most segments was very low, which indicated an obvious bias against guanine. According to the variable sites and the Kimura 2‐parameter distances (Table 5), ND2 had the highest proportion of variable sites (34.4%) and the largest genetic distance (0.163) among the 13 PCGs, in accordance with the conclusion of Qiao (2014). COII had a low proportion of variable sites and a small genetic distance, which indicated that its sequence was highly conserved.
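The two screening steps in this paragraph (the Ts/Tv threshold of 2.0 and the ranking of genes by proportion of variable sites) can be reproduced from the Table 5 values with a minimal Python sketch. The variable names are illustrative assumptions; the numbers are copied from the 13 PCG rows of Table 5.

```python
# gene: (consensus length, variable sites, Ts/Tv ratio), copied from Table 5
pcg_stats = {
    "ND1": (975, 297, 4.18),
    "ND2": (1047, 360, 4.13),
    "COI": (1551, 314, 5.91),
    "COII": (691, 139, 7.14),
    "ATP8": (165, 40, 7.34),
    "ATP6": (684, 181, 4.09),
    "COIII": (785, 150, 7.01),
    "ND3": (351, 93, 6.32),
    "ND4L": (297, 60, 5.65),
    "ND4": (1383, 395, 4.69),
    "ND5": (1839, 555, 4.51),
    "ND6": (522, 166, 6.37),
    "Cytb": (1141, 275, 3.93),
}

SATURATION_THRESHOLD = 2.0  # Ts/Tv below this value is taken as a sign of saturation

# Percentage of variable sites per gene
percent_variable = {
    gene: 100 * variable / length
    for gene, (length, variable, _) in pcg_stats.items()
}
most_variable = max(percent_variable, key=percent_variable.get)
all_unsaturated = all(
    ts_tv > SATURATION_THRESHOLD for _, _, ts_tv in pcg_stats.values()
)

print(f"most variable PCG: {most_variable} ({percent_variable[most_variable]:.1f}%)")
print("all Ts/Tv ratios above 2.0:", all_unsaturated)
```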

Phylogenetic analysis

Based on the 13 PCGs of 8 Rhynchocypris species, we constructed a phylogenetic tree by the maximum likelihood method with 1,000 bootstrap replications, with Acrossocheilus fasciatus as the outgroup. Before the phylogenetic analysis, we used the software DAMBE (Xia, 2013) to test for substitution saturation in the PCGs of the Rhynchocypris species by comparing the transition and transversion rates against GTR distance. The results showed that transitions and transversions were not saturated and that the sequences could be used for phylogenetic analysis (Figure 7a). In addition, Miya and Nishida (2000a, 2000b) suggested that the ND6 gene should be excluded from phylogenetic analysis because of its heterogeneous base composition and consistently poor phylogenetic performance, so we constructed another phylogenetic tree that excluded ND6. The results of the two analyses were almost the same. R. percnurus, R. oxycephalus, and R. o. jouyi formed a sister group to R. lagowskii and R. semotilus, and this combined group was in turn sister to R. p. sachalinensis and R. p. mantschuricus. R. kumgangensis was more distantly related to the other Rhynchocypris species but still clustered with them. The two phylogenetic trees are shown in Figure 8a, b.
Figure 7

(a) Saturation plot for the substitutions of 13 protein‐coding genes; (b) saturation plot for the substitutions of the complete mitochondrial genome (except the D‐loop)

Figure 8

(a) Maximum likelihood analysis of phylogenetic relationships based on 12 PCGs (except ND6) of 8 Rhynchocypris species. (b) Maximum likelihood analysis of phylogenetic relationships based on 13 PCGs of 8 Rhynchocypris species. Acrossocheilus fasciatus was selected as the outgroup to root the tree in both (a) and (b)

To further investigate the phylogenetic relationships of Leuciscus species, the relationships were reconstructed based on the complete mitochondrial genome. Seventeen species, including one species of Pseudaspius, two species of Tribolodon, two species of Phoxinus, one species of Oreoleuciscus, and two species of Leuciscus, were used in the phylogenetic analysis (Table 6). Because of its fast mutation rate, the D‐loop region was excluded from the phylogenetic analysis. The maximum likelihood and Bayesian trees were constructed based on the complete mitochondrial genome (excluding the D‐loop), with Acrossocheilus fasciatus as the outgroup.
Table 6

Mitochondrial genome of the Leuciscus species used in this study

| Genus | Species | Length (bp) | Accession number | Reference |
|---|---|---|---|---|
| Pseudaspius | P. leptocephalus | 16,604 | AP009058 | Saitoh et al. (2006) |
| Tribolodon | T. hakonensis | 16,602 | AB626855 | Imoto et al. (2013) |
| | T. brandtii | 16,598 | NC_018819 | Imoto et al. (2013) |
| Phoxinus | P. phoxinus | 17,859 | AP009309 | Imoto et al. (2013) |
| | P. ujmonensis | 17,738 | KJ000673 | Xu et al. (2013) |
| Oreoleuciscus | O. potanini | 16,602 | AB626851 | Imoto et al. (2013) |
| Leuciscus | L. burdigalensis | 16,607 | KT223568 | Hinsinger et al. (2015) |
| | L. waleckii | 16,605 | NC_018825 | Wang et al. (2013) |
| Acrossocheilus | A. fasciatus | 16,589 | KF781289 | Cheng et al. (2015) |
The substitution saturation results showed that transitions and transversions were not saturated and that the sequences could be used for phylogenetic analysis (Figure 7b). The topologies of the maximum likelihood and Bayesian trees constructed from the complete mitochondrial genome sequence were identical. In these trees, R. oxycephalus, R. percnurus, R. lagowskii, R. p. mantschuricus, R. p. sachalinensis, R. semotilus, and A. fasciatus clustered as a monophyletic group. This group appeared as the sister group to R. kumgangensis, Pseudaspius leptocephalus, Tribolodon hakonensis, and T. brandtii. The combined clade formed a sister group with Phoxinus phoxinus and P. ujmonensis, and the resulting clade in turn formed a sister group with Leuciscus burdigalensis and L. waleckii. The two phylogenetic trees are shown in Figure 9a, b.
Figure 9

(a) The phylogenetic relationships among 17 Leuciscus species based on the complete mitochondrial genome (excluding the D‐loop) from maximum likelihood analysis. The bootstrap support values are shown above the branches. (b) The phylogenetic relationships among 17 Leuciscus species based on the complete mitochondrial genome (excluding the D‐loop) from Bayesian analysis. Acrossocheilus fasciatus was selected as the outgroup to root the tree in both (a) and (b)

The analysis of the DNA barcoding

We used the software MEGA 5.0 (Tamura et al., 2011) to calculate the number of variable sites and the variation rate of 6 PCGs (COI, COIII, Cytb, ND2, ND4, and ND5) in Rhynchocypris. The variation rates of the 6 PCGs were 23.0%, 24.8%, 31.8%, 42.1%, 34.7%, and 35.5%, respectively. These variation rates were relatively large, so all 6 PCGs were considered good candidate DNA bar codes for Rhynchocypris species. The mean interspecies and intraspecies distances of the 6 PCGs, calculated with the Kimura 2‐parameter model, are shown in Figure 10. According to Figure 10, ND2 had the maximum interspecies distance among the 6 PCGs, while COI had the minimum, and the interspecies distance of ND2 was obviously larger than that of the other 5 PCGs. The order among the 6 PCGs was COI < COIII < Cytb < ND4 = ND5 < ND2. ND2 also had the maximum intraspecies distance among the 6 PCGs, while COI had the minimum, and the intraspecies distances of COI and COIII were obviously smaller than those of the other 4 PCGs. The order among the 6 PCGs was COI < COIII < ND4 < ND5 < Cytb < ND2.
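The variation rate reported for each gene is the share of alignment columns that are variable. A minimal sketch of that count follows; the function name is hypothetical, and treating a column as variable when it contains at least two different unambiguous bases (ignoring gaps) is an assumption that may differ from MEGA's own settings.

```python
def variable_site_rate(sequences):
    """Return (number of variable sites, variation rate) for aligned sequences."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    variable = 0
    for column in zip(*(s.upper() for s in sequences)):
        bases = {b for b in column if b in "ACGT"}  # ignore gaps/ambiguity codes
        if len(bases) > 1:
            variable += 1
    return variable, variable / length


# Toy alignment (made-up sequences): columns 3 and 4 are variable.
count, rate = variable_site_rate(["ACGT", "ACGA", "ACTT"])
```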
Figure 10

Kimura 2‐parameter distance among 6 selected protein‐coding genes. (a) interspecies (b) intraspecies

According to the degree of differentiation of genetic variation, the software SPSS 19.0 was used to perform the Wilcoxon test on the 6 PCGs (Tables 6, 7a, b). We compared the test values of the different segments as the basis for segment filtering. At a significance level of p < 0.05, the Wilcoxon test ranked the interspecies distances in Rhynchocypris species as COI < COIII < Cytb < ND4 = ND5 < ND2 and the intraspecies distances as COI = COIII < ND5 < ND4 = Cytb <= ND2. These results were basically consistent with the results of the sequence alignment.
Table 7

(a) The Wilcoxon test of interspecific divergence among Rhynchocypris species; (b) The Wilcoxon test of intraspecific divergence among Rhynchocypris species

(a)

| Gene 1 | Gene 2 | W+ | W− | n | p value | Result |
|---|---|---|---|---|---|---|
| COI | COIII | 6,698 | 19,408 | 228 | < 0.001 | COI < COIII |
| COI | ND2 | 0 | 26,106 | 228 | < 0.001 | COI < ND2 |
| COI | ND4 | 0 | 26,106 | 228 | < 0.001 | COI < ND4 |
| COI | ND5 | 0 | 26,106 | 228 | < 0.001 | COI < ND5 |
| COI | Cytb | 399 | 25,707 | 228 | < 0.001 | COI < Cytb |
| COIII | ND2 | 141 | 25,965 | 228 | < 0.001 | COIII < ND2 |
| COIII | ND4 | 607 | 25,499 | 228 | < 0.001 | COIII < ND4 |
| COIII | ND5 | 722 | 25,384 | 228 | < 0.001 | COIII < ND5 |
| COIII | Cytb | 5,162 | 20,944 | 228 | < 0.001 | COIII < Cytb |
| ND2 | ND4 | 25,965 | 141 | 228 | < 0.001 | ND2 > ND4 |
| ND2 | ND5 | 25,808 | 198 | 228 | < 0.001 | ND2 > ND5 |
| ND2 | Cytb | 24,235 | 1,871 | 228 | < 0.001 | ND2 > Cytb |
| ND4 | ND5 | 14,286 | 11,820 | 228 | 0.216 | ND4 = ND5 |
| ND4 | Cytb | 3,852 | 22,254 | 228 | < 0.001 | ND4 > Cytb |
| ND5 | Cytb | 3,827 | 22,279 | 228 | < 0.001 | ND5 > Cytb |

(b)

| Gene 1 | Gene 2 | W+ | W− | n | p value | Result |
|---|---|---|---|---|---|---|
| COI | COIII | 48 | 142 | 25 | 0.058 | COI = COIII |
| COI | ND2 | 0 | 153 | 25 | < 0.001 | COI < ND2 |
| COI | ND4 | 0 | 153 | 25 | < 0.001 | COI < ND4 |
| COI | ND5 | 0 | 153 | 25 | < 0.001 | COI < ND5 |
| COI | Cytb | 10 | 180 | 25 | 0.001 | COI < Cytb |
| COIII | ND2 | 22 | 168 | 25 | 0.003 | COIII < ND2 |
| COIII | ND4 | 34 | 156 | 25 | 0.014 | COIII < ND4 |
| COIII | ND5 | 37 | 153 | 25 | 0.020 | COIII < ND5 |
| COIII | Cytb | 26 | 184 | 25 | 0.003 | COIII < Cytb |
| ND2 | ND4 | 150 | 3 | 25 | < 0.001 | ND2 > ND4 |
| ND2 | ND5 | 150 | 3 | 25 | < 0.001 | ND2 > ND5 |
| ND2 | Cytb | 112 | 78 | 25 | 0.494 | ND2 = Cytb |
| ND4 | ND5 | 150 | 34 | 25 | 0.044 | ND4 > ND5 |
| ND4 | Cytb | 72 | 118 | 25 | 0.355 | ND4 = Cytb |
| ND5 | Cytb | 53 | 137 | 25 | 0.091 | ND5 = Cytb |
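The two rank sums reported for each gene pair in Table 7 can be reproduced in outline with the sketch below, which computes the positive and negative signed‐rank sums for paired distances. It is a simplified illustration with a hypothetical function name, not the SPSS procedure: zero differences are dropped, tied absolute differences receive average ranks, and no p value is computed. Differences are taken as first gene minus second gene, which matches the rows where the positive rank sum is 0 and the first gene is reported as smaller. When no difference is zero, the two sums add up to n(n + 1)/2; for n = 228 that is 26,106, as in the first row of Table 7a (6,698 + 19,408).

```python
def signed_rank_sums(x, y):
    """Return (W_plus, W_minus, n_nonzero) for paired samples x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must be paired (same length)")
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Extend j over the run of tied absolute differences starting at i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Toy paired values (made up): differences are -1, -2, -3, +1.
w_plus, w_minus, n = signed_rank_sums([1, 2, 3, 4], [2, 4, 6, 3])
```

With real distances, which are floating‐point numbers, exact ties are rare, so rounding the differences before ranking may be needed to match a statistics package.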
According to the theory of ideal DNA barcoding proposed by Meyer and Paulay (2005), the interspecies variation of an ideal DNA bar code should be significantly larger than the intraspecies variation, and there should be a gap between the two, which is called the DNA barcoding gap. The distributions of interspecific and intraspecific variation of Rhynchocypris species in the 6 PCGs are shown in Figure 11. We found that the average interspecies distance of each of the 6 PCGs was larger than the intraspecies distance, but the intraspecies and interspecies distributions of each PCG overlapped to different degrees. None of the 6 PCGs had an obvious DNA barcoding gap. However, the COI, Cytb, and ND2 genes had less overlap between the intraspecies and interspecies distributions, which was beneficial to species differentiation.
Figure 11

Distribution of interspecific and intraspecific variations of Rhynchocypris species. (a) COI sequence; (b) COIII sequence; (c) Cytb sequence; (d) ND2 sequence; (e) ND4 sequence; (f) ND5 sequence
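One simple way to quantify the barcoding gap described above is to compare the smallest interspecific distance with the largest intraspecific distance. The sketch below is an illustration under that assumption; the function name and the toy distances are hypothetical, and other overlap measures could equally be used.

```python
def barcoding_gap(intra, inter):
    """Summarize how well intraspecific and interspecific distances separate."""
    if not intra or not inter:
        raise ValueError("both distance lists must be non-empty")
    max_intra = max(intra)
    min_inter = min(inter)
    overlapping = sum(1 for d in inter if d <= max_intra)
    return {
        "gap": min_inter - max_intra,  # positive only when a gap exists
        "has_gap": min_inter > max_intra,
        "inter_overlap_fraction": overlapping / len(inter),
    }


# Toy distances (made up): one interspecific value falls inside the
# intraspecific range, so there is overlap and no barcoding gap.
summary = barcoding_gap(intra=[0.00, 0.01, 0.02], inter=[0.015, 0.05, 0.10, 0.12])
```

A gene with a smaller `inter_overlap_fraction` separates species more cleanly even when no strict gap exists.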

DISCUSSION

Structural features of the mitochondrial genome of R. oxycephalus

In this study, the complete sequence of the mitochondrial genome of R. oxycephalus was obtained. R. oxycephalus had the same mitochondrial genome structure as other Cyprinidae species, with a total length of 16,609 bp and an A + T content of 56.0%, which was consistent with the A + T preference of vertebrates. This indicated that the gene order of mitochondrial genomes rarely changes, which makes it suitable for resolving the phylogenetic relationships of higher‐level taxa such as families and orders (Boore, 1999). Base G had the lowest content in the mitochondrial genome of R. oxycephalus. This phenomenon might be related to the way the mitochondrial genome is replicated. Specifically, the H strand replicated first, and when H‐strand replication reached the origin of light‐strand replication, the L strand began to replicate. As a result, the L strand remained single‐stranded for a relatively long time and was prone to base mutations, so the more stable G base was gradually replaced by other bases (Clayton, 1982). The mitochondrial genome contained 12 intergenic regions and seven overlapping regions. This phenomenon was also common in other Cyprinidae species (Wu et al., 2009; Zhang et al., 2009). Among the 13 PCGs of R. oxycephalus, as in other vertebrates, all genes except ND6 showed a strong A + T bias and a preference for C. ND6 was the PCG encoded on the L strand, which indicated large differences in base composition between the genes encoded by the H strand and the L strand. The start codons of the R. oxycephalus PCGs were relatively constant and had the general characteristics of bony fish (Chang, Huang, & Lo, 1994), while the stop codons varied greatly. Besides complete stop codons, there were two types of incomplete stop codons (T and TA). This phenomenon was widespread in mitochondrial genomes. 
The transcripts of these protein‐coding sequences therefore ended in U or UA at the 3' end. Because a poly(A) tail is added to the 3' end of the mRNA, a complete stop codon could be formed by polyadenylation during processing (Ojala, Montoya, & Attardi, 1981). Among the 22 tRNA genes of R. oxycephalus, all except tRNA‐Ser (AGY) could fold into a typical cloverleaf structure. The tRNA‐Ser (AGY) lacked the DHU arm and formed a single loop at the position of the DHU arm. This structure was very common in fish mitochondria (Lee & Kocher, 1995; Noack, Zardoya, & Meyer, 1996). Cheng et al. (2015) showed that a tRNA lacking the DHU arm could adjust its structural conformation, so the loss did not affect its ability to enter the ribosome or to carry and transport amino acids. In addition, the putative origin of light‐strand replication was a region with a fast rate of evolution and a high degree of variation, which could fold into a stable stem‐loop secondary structure. Similar structures were found in fishes, amphibians, and mammals, but not in reptiles and birds (Ojala et al., 1981; Wolstenholme, 1992). Generally speaking, the control region of the mitochondrial genome played an important role in regulating replication and transcription. The sequence length of the control region was also closely related to the length of the whole mitochondrial genome. The control region consisted of a termination‐associated sequence, a central conserved domain, and conserved sequence blocks. The termination‐associated sequence was the most variable part of the control region and is involved in the termination of DNA replication (Hai, Yang, Wei, Ming, & Hu, 2003). In the termination‐associated sequence, there was an obvious hairpin structure (TACAT and ATGTA). Several TACAT sequences could also be found in the downstream sequence (Lin et al., 2006). 
The central conserved domain was the most conserved part of the control region, and it was highly conserved in almost all fishes. Three conserved regions, CSB‐D, CSB‐E, and CSB‐F, could be identified in it by comparison with other Cyprinidae species. In the conserved sequence block domain, three conserved regions, CSB‐1, CSB‐2, and CSB‐3, could be identified. This region was presumed to be involved in the generation of H‐strand RNA primers (Walberg & Clayton, 1981). CSB‐2 and CSB‐3 were generally conserved in fish, but CSB‐1 varied greatly (Liu, Wu, Zhu, & Zhuang, 2010). Tandem Repeats Finder analysis showed that there was an AT repeat region at 816–847 bp of the control region, in which the sequence AT was repeated 15 times. This region was also found in other Cyprinidae species (Liu, Tzeng, & Teng, 2002). Different numbers of AT repeats resulted in different lengths of the conserved sequence region among fishes.
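The control‐region features mentioned above (the TACAT and ATGTA motifs and the AT dinucleotide repeat) can be located with simple pattern matching. The sketch below uses Python's standard `re` module; it is only an illustration with hypothetical function names and a made‐up test sequence, and it is far simpler than Tandem Repeats Finder, which also handles imperfect repeats.

```python
import re


def find_motif(seq, motif):
    """Return 1-based start positions of every (possibly overlapping) match."""
    pattern = f"(?={re.escape(motif.upper())})"  # lookahead allows overlaps
    return [m.start() + 1 for m in re.finditer(pattern, seq.upper())]


def longest_repeat(seq, unit="AT"):
    """Return (start, end, copies) of the longest perfect tandem run of `unit`.

    Positions are 1-based and inclusive; returns None if `unit` never occurs.
    """
    best = None
    for m in re.finditer(f"(?:{re.escape(unit.upper())})+", seq.upper()):
        copies = len(m.group()) // len(unit)
        if best is None or copies > best[2]:
            best = (m.start() + 1, m.end(), copies)
    return best


# Made-up sequence for illustration: TACAT at 3, ATGTA at 10, (AT)5 at 17-26.
toy_region = "GGTACATCCATGTAGG" + "AT" * 5 + "C"
tacat_sites = find_motif(toy_region, "TACAT")
atgta_sites = find_motif(toy_region, "ATGTA")
at_run = longest_repeat(toy_region, "AT")
```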

The phylogenetic relationships of Rhynchocypris species

In recent years, a growing number of studies on the genus Rhynchocypris have been published. Imoto et al. (2013) considered that the genera Tribolodon, Pseudaspius, and Rhynchocypris from East Asia and the genus Oreoleuciscus cluster as a monophyletic group, and that this group forms a sister group with the genus Phoxinus. Xu (2013) combined mitochondrial genes (16S rRNA, Cytb) with nuclear genes (Rag1, Rag2) to address three questions: (a) whether Rhynchocypris species form a monophyletic group; (b) the phylogenetic position of the genus Rhynchocypris; and (c) the intrageneric phylogeny of the genus Rhynchocypris. Xu (2013) concluded that the genus Rhynchocypris is a polyphyletic group and that its phylogenetic position should be redefined. In this study, maximum likelihood and Bayesian analyses were performed based on the complete mitochondrial genome and the 13 PCGs of Rhynchocypris and Leuciscus species, and the topologies of the two trees based on the complete mitochondrial genome sequence were identical. All the trees had high bootstrap support values. The result indicated that the genus Rhynchocypris is a polyphyletic group and that R. kumgangensis is distantly related to the other Rhynchocypris species. This conclusion was consistent with former results (Imoto et al., 2013; Xu, 2013). However, the phylogenetic position of R. p. sachalinensis and R. p. mantschuricus differed from that in the analysis of Imoto et al. (2013), which clustered them with R. percnurus. Possible reasons for this difference are the geographical origin of the selected fish and the different genes used for alignment. This showed that the phylogenetic relationships of certain species in Rhynchocypris are still not very clear. More genome sequences and more Rhynchocypris species from different regions should be used in phylogenetic analysis to determine the relationships within Rhynchocypris. 
In general, although a few consistent results were obtained, the number of samples was small, so the phylogenetic relationships of Rhynchocypris species remain to be further analyzed and validated with a wider range of species, more sequences, and additional analytical tools.
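Whether a genus is recovered as monophyletic can be checked mechanically on a rooted tree: the genus is monophyletic only if some internal node has exactly the genus members as its tips. The sketch below shows this check on a toy tree written as nested tuples; the function names, the tree, and the taxon labels are hypothetical and do not reproduce the topologies in Figures 8 and 9. A negative result only shows that the group is not monophyletic; it does not by itself distinguish paraphyly from polyphyly.

```python
def leaves(node):
    """Return the set of tip labels below a node (tips are strings)."""
    if isinstance(node, str):
        return {node}
    tips = set()
    for child in node:
        tips |= leaves(child)
    return tips


def is_monophyletic(tree, taxa):
    """True if some node of the rooted tree has exactly `taxa` as its tips."""
    target = set(taxa)

    def visit(node):
        if leaves(node) == target:
            return True
        if isinstance(node, str):
            return False
        return any(visit(child) for child in node)

    return visit(tree)


# Toy rooted tree with placeholder labels: genus X has members Xa-Xd,
# but Xd groups with a member of genus Y.
toy_tree = ("outgroup", (("Xa", ("Xb", "Xc")), (("Xd", "Ya"), "Yb")))
core_ok = is_monophyletic(toy_tree, {"Xa", "Xb", "Xc"})
genus_ok = is_monophyletic(toy_tree, {"Xa", "Xb", "Xc", "Xd"})
```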

DNA barcoding of Rhynchocypris species

Nowadays, a growing number of researchers use mitochondrial genes as DNA bar codes to identify animal species. By constructing phylogenetic trees for 13 PCGs, Tang, Zheng, Ma, Cheng, and Li (2017) concluded that the ND5 gene had the potential to be a DNA bar code for Octopodidae, and their validation results were generally in accordance with the traditional morphological classification. By analyzing SNP loci, Sperling, Rosengarten, Moreno, and Dellaporta (2012) concluded that the ND2 and ND5 genes can be used as a supplement to the COI gene for DNA barcoding. In addition, Chen, Jiang, and Qiao (2012) used the COI, COII, and Cytb gene sequences to test the feasibility of DNA barcoding for identifying insect germplasm and proposed the “TAG,” with germplasm identified according to differences in TAG. The control region can also be barcoded by the TAG method even though it is the hypervariable region of the mitochondrion (Chen et al., 2012). In theory, the ideal DNA barcoding sequence should have large variation between species, small intraspecific variation, and a DNA barcoding gap. In this study, the interspecies distances of the 6 PCGs we selected were all larger than the intraspecies distances. Relatively speaking, the COI and ND2 genes had larger interspecies distances and smaller intraspecies distances, so these two PCGs performed better than the other four for analyzing genetic distance. However, we could not find an obvious DNA barcoding gap in any of the six PCGs. Moritz and Cicero (2004) suggested that if the collected samples contain many closely related species, the overlap between interspecies and intraspecies variation will increase, so the DNA barcoding gap may not exist. Another possible reason is hybridization or genetic introgression between neighboring species, which would also increase the overlap between interspecific and intraspecific variation. 
This phenomenon is also present in other Rhynchocypris species (Xu, 2013). Relatively speaking, the COI, Cytb, and ND2 genes had less overlap between the intraspecies and interspecies distributions. We therefore concluded that the COI and ND2 genes are suitable DNA bar codes for Rhynchocypris species. For different study taxa, it is necessary to compare different DNA barcoding genes. Because more COI sequences than ND2 sequences of Rhynchocypris species have been published in GenBank and COI has well‐established universal primers, COI is more convenient and efficient to work with. We therefore recommend using the COI sequence as the DNA bar code for identifying Rhynchocypris species, with the ND2 gene used for assisted identification.

CONFLICT OF INTEREST

None declared.

AUTHOR CONTRIBUTIONS

QC conceived the ideas and designed the study; QC, ZZ, and YG performed the experiments and collected the data; ZZ and QC analyzed the data; QC, ZZ, and YG interpreted the results; ZZ and QC wrote the manuscript. All authors contributed critically to the drafts and gave final approval for publication.