
Phylogenetic inference of calyptrates, with the first mitogenomes for Gasterophilinae (Diptera: Oestridae) and Paramacronychiinae (Diptera: Sarcophagidae).

Dong Zhang, Liping Yan, Ming Zhang, Hongjun Chu, Jie Cao, Kai Li, Defu Hu, Thomas Pape.   

Abstract

The complete mitogenome of the horse stomach bot fly Gasterophilus pecorum (Fabricius) and a near-complete mitogenome of Wohlfahrt's wound myiasis fly Wohlfahrtia magnifica (Schiner) were sequenced. The mitogenomes contain the typical 37 mitogenes found in metazoans, organized in the same order and orientation as in other cyclorrhaphan Diptera. Phylogenetic analyses of mitogenomes from 38 calyptrate taxa with and without two non-calyptrate outgroups were performed using Bayesian Inference and Maximum Likelihood. Three sub-analyses were performed on the concatenated data: (1) not partitioned; (2) partitioned by gene; (3) 3rd codon positions of protein-coding genes omitted. We estimated the contribution of each of the mitochondrial genes for phylogenetic analysis, as well as the effect of some popular methodologies on calyptrate phylogeny reconstruction. In the favoured trees, the Oestroidea are nested within the muscoid grade. Relationships at the family level within Oestroidea are (remaining Calliphoridae (Sarcophagidae (Oestridae, Pollenia + Tachinidae))). Our mito-phylogenetic reconstruction of the Calyptratae presents the most extensive taxon coverage so far, and the risk of long-branch attraction is reduced by an appropriate selection of outgroups. We find that in the Calyptratae the ND2, ND5, ND1, COIII, and COI genes are more phylogenetically informative compared with other mitochondrial protein-coding genes. Our study provides evidence that data partitioning and the inclusion of conserved tRNA genes have little influence on calyptrate phylogeny reconstruction, and that the 3rd codon positions of protein-coding genes are not saturated and therefore should be included.

Keywords:  Calyptratae; Oestroidea; gene contribution; long-branch attraction; mitogenome; phylogeny; taxon sampling

Year:  2016        PMID: 27019632      PMCID: PMC4807417          DOI: 10.7150/ijbs.12148

Source DB:  PubMed          Journal:  Int J Biol Sci        ISSN: 1449-2288            Impact factor:   6.580


Introduction

During the last decade, phylogeny reconstruction of extant life forms has shifted from being mainly morphology-based to currently relying primarily on evidence from molecular data. Just as for morphological data, it is imperative that we acquire an in-depth understanding of the information content and suitability of the sequence data used in the analysis. The mitochondrial genome, with its multiple copies in each cell, as well as the design of conserved primers for broad-scale amplification 1, has been dominating until the advent of next generation sequencing 2. Still, as the mitochondrial genome appears to evolve faster than the nuclear genome 3-5, it should have the potential for a finer phylogenetic resolution (or higher branch support values) in fast-evolving groups. Mitogenomes are thought to be reliable markers for reconstructing phylogenies 5, but the use of mitochondrial data in deep phylogeny analyses has been criticized because of biases in nucleotide frequencies, high substitution rates of nucleotides, and different phylogenetic information content among genes 6-12. Further, although complete mitogenomes provide high phylogenetic resolution and the most accurate dating estimates, phylogenetic estimates based on complete and incomplete mitogenomes have yielded inconsistent phylogenies for insects 13, 14, and vertebrates as well 8, 15, indicating conflicts between the phylogenetic signals provided by the different mitochondrial genes. Mitogenomes are therefore of considerable interest in evolutionary studies, and the aim of the present paper is to investigate the phylogenetic signal contained in the mitogenome for members of the large subradiation of the schizophoran flies termed Calyptratae. The nearly 20 000 described species of calyptrate flies comprise the largest and ecologically most diverse clade within the schizophoran super-radiation 11, 16, 17. 
Several phylogenetic studies, utilizing morphological and molecular data, have been conducted at different taxonomic levels within this group, but the phylogenetic relationships within major parts of this fly radiation are still controversial 17-39. Inferring a robust phylogeny of the Calyptratae will require the combination of a sufficiently informative dataset with carefully selected terminal taxa 6, 40, 41. At present, there are 36 complete or near-complete mitogenome sequences of calyptrate species in GenBank, representing the muscoid families Anthomyiidae, Fanniidae, Muscidae, and Scathophagidae, as well as several families of the Oestroidea. We sequenced the complete mtDNA of the stomach bot fly Gasterophilus pecorum (Fabricius), as the first representative of the bot fly subfamily Gasterophilinae (Oestridae), and provide a near-complete mitogenome of Wohlfahrtia magnifica (Schiner) as the first representative of the subfamily Paramacronychiinae (Sarcophagidae). We use these data to reconstruct calyptrate phylogeny and discuss mitogenomic performance.

Materials and methods

Taxon sampling

Complete (or near-complete) mitogenomes from a total of 38 calyptrate taxa and 2 non-calyptrate taxa were obtained by downloading existing data from GenBank and adding the two mitogenomes sequenced in the present study (Table 1). All four families of the muscoid grade are represented, but the Hippoboscoidea are not represented because no mitogenomes are available for that clade. The family-level classification of the Oestroidea is not yet settled, primarily because of the long-standing issue of resolving the non-monophyletic status of the blow flies 17, 33. We include representatives from the traditional Calliphoridae, the Tachinidae, the Sarcophagidae (adding the first species from subfamily Paramacronychiinae) and the Oestridae (adding the first species from subfamily Gasterophilinae). We include one representative from each of the tachinid subfamilies Exoristinae and Dexiinae (Table 1) but exclude Exorista sorbillans (Wiedemann) (Tachinidae) 66 because a BLAST search places it close to Drosophila and widely removed from other species of the family. No mitogenome is available for the small family Rhinophoridae. Two non-calyptrate outgroup taxa are chosen to root the tree: the lower brachyceran species Cydistomyia duplonotata (Ricardo) (Tabanidae) and the acalyptrate species Drosophila melanogaster (Meigen) (Drosophilidae).
Table 1

Summary of mitogenomes from Calyptratae, and two outgroup species.

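Superfamily | Family | Subfamily | Species | Locus | Reference
Tabanoidea | Tabanidae | Tabanidae | Cydistomyia duplonotata * | NC_008756 | [14]
Ephydroidea | Drosophilidae |  | Drosophila melanogaster * | NC_024511 | [42]
Oestroidea | Calliphoridae | Calliphorinae | Aldrichina grahami | KP872701 | [43]
Oestroidea | Calliphoridae | Calliphorinae | Calliphora vicina | NC_019639 | [44]
Oestroidea | Calliphoridae | Chrysomyinae | Cochliomyia hominivorax | NC_002660 | [45]
Oestroidea | Calliphoridae | Chrysomyinae | Chrysomya albiceps | NC_019631 | [44]
Oestroidea | Calliphoridae | Chrysomyinae | Chrysomya bezziana | NC_019632 | [44]
Oestroidea | Calliphoridae | Chrysomyinae | Chrysomya megacephala | NC_019633 | [44]
Oestroidea | Calliphoridae | Chrysomyinae | Chrysomya pinguis | NC_025338 | [46]
Oestroidea | Calliphoridae | Chrysomyinae | Chrysomya putoria | NC_002697 | [47]
Oestroidea | Calliphoridae | Chrysomyinae | Chrysomya rufifacies | NC_019634 | [44]
Oestroidea | Calliphoridae | Chrysomyinae | Chrysomya saffranea | NC_019635 | [44]
Oestroidea | Calliphoridae | Chrysomyinae | Phormia regina | NC_026668 | [48]
Oestroidea | Calliphoridae | Chrysomyinae | Protophormia terraenovae | NC_019636 | [44]
Oestroidea | Calliphoridae | Luciliinae | Hemipyrellia ligurriens | NC_019638 | [44]
Oestroidea | Calliphoridae | Luciliinae | Lucilia cuprina | NC_019573 | [44]
Oestroidea | Calliphoridae | Luciliinae | Lucilia porphyrina | NC_019637 | [44]
Oestroidea | Calliphoridae | Luciliinae | Lucilia sericata | NC_009733 | [49]
Oestroidea | Calliphoridae | Polleniinae | Pollenia rudis | JX913761 | [44]
Oestroidea | Sarcophagidae | Paramacronychiinae | Wohlfahrtia magnifica |  | Present study
Oestroidea | Sarcophagidae | Sarcophaginae | Ravinia pernix | NC_026196 | [50]
Oestroidea | Sarcophagidae | Sarcophaginae | Sarcophaga africa | NC_025944 | [51]
Oestroidea | Sarcophagidae | Sarcophaginae | Sarcophaga crassipalpis | NC_026667 | [52]
Oestroidea | Sarcophagidae | Sarcophaginae | Sarcophaga impatiens | NC_017605 | [53]
Oestroidea | Sarcophagidae | Sarcophaginae | Sarcophaga melanura | NC_026112 | [54]
Oestroidea | Sarcophagidae | Sarcophaginae | Sarcophaga peregrina | NC_023532 | [55]
Oestroidea | Sarcophagidae | Sarcophaginae | Sarcophaga portschinskyi | NC_025574 | [56]
Oestroidea | Sarcophagidae | Sarcophaginae | Sarcophaga similis | NC_025573 | [57]
Oestroidea | Tachinidae | Exoristinae | Elodia flavipalpis | NC_018118 | [58]
Oestroidea | Tachinidae | Dexiinae | Rutilia goerlingiana | NC_019640 | [44]
Oestroidea | Oestridae | Cuterebrinae | Dermatobia hominis | NC_006378 | [59]
Oestroidea | Oestridae | Gasterophilinae | Gasterophilus pecorum |  | Present study
Oestroidea | Oestridae | Hypodermatinae | Hypoderma lineatum | NC_013932 | [60]
Muscoid grade | Anthomyiidae |  | Delia platura | KP901268 | [61]
Muscoid grade | Fanniidae |  | Euryomma sp. | KP901269 | [61]
Muscoid grade | Muscidae | Muscinae | Haematobia irritans | NC_007102 | [62]
Muscoid grade | Muscidae | Muscinae | Musca domestica | NC_024855 | [63]
Muscoid grade | Muscidae | Muscinae | Stomoxys calcitrans | DQ533708 | [62]
Muscoid grade | Muscidae | Reinwardtiinae | Muscina stabulans | NC_026292 | [64]
Muscoid grade | Scathophagidae | Scathophaginae | Scathophaga stercoraria | NC_024856 | [65]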

* Species used as outgroups in subgroup_1.

DNA extraction, amplification, sequencing, and annotation

A single larva of G. pecorum, collected in Kalamaili, Xinjiang Province, China, in 2013 and preserved in 99.5% ethanol, and a dry male specimen of W. magnifica, reared from a larva collected at the Xinjiang Research Centre for Breeding Przewalski's Horse, Ürümqi, Xinjiang, China, in 2014, served as sources of mitogenomic DNA. Muscular tissue was transferred to an Eppendorf tube and incubated in lysis solution with proteinase K (Beijing Cowin Bioscience Co., Ltd., China) at 56℃ for three hours, and DNA was isolated with phenol-chloroform (MYM Biological Technology Ltd., India). After extraction in ethanol, genomic DNA was dissolved in Tris-EDTA buffer and stored at -20℃ until use. The mitochondrial genomes were amplified by Polymerase Chain Reaction (PCR) using 16 primer pairs (Table S1) synthesized by Beijing Genomics Institute. Genomic DNA (1 μL) was added to the PCR reaction mix (24 μL), which contained 17.2 μL sterilized distilled water, 2.5 μL of 10× Es Taq PCR Buffer (Beijing Cowin Bioscience Co., Ltd., China), 2 μL of dNTPs mixture (2.5 mM each) (Beijing Cowin Bioscience Co., Ltd., China), 1 μL of each primer (10 μM), and 0.3 μL of Es Taq DNA Polymerase (1.5 U) (Beijing Cowin Bioscience Co., Ltd., China). The PCR was performed in a thermal cycler, using the cycling protocols reported in Table S1 for each primer combination. The polymerase activation (95℃, 10 min), the denaturation (95℃, 1 min), the final extension (72℃, 7 min) and the number of cycles (n = 35) were identical in all of the protocols 60. The PCR products were detected via electrophoresis in 0.5% agarose gels and sized by comparison with markers (Beijing Cowin Bioscience Co., Ltd., China). Gels were photographed using a gel documentation system. Amplicons were sequenced bidirectionally using the BigDye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Inc., USA), and then purified using the BigDye XTerminator® Purification Kit (Applied Biosystems, Inc., USA).
The purified products were run on an ABI 3730xl DNA Analyzer (Applied Biosystems, Inc., USA), and the resulting sequences were read with Chromas (Technelysium, Ltd., Australia). BioEdit v7.0.9.0 69 and SeqMan (DNAStar, Steve ShearDown, 1998-2001 version, DNASTAR Inc., USA) were used for assembly of raw sequences. Gene boundaries and genomic positions of protein-coding genes (PCGs), ribosomal RNA genes, and transfer RNA genes were identified by BLAST search 70, MITOS search 71, and by comparison with other calyptrates using DNAMAN software (version 8, Lynnon Corp., Canada). Nucleotide composition and codon usage were calculated using MEGA 5.2 72.
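The volumes of the PCR reaction mix described above can be checked with a short calculation. The following is only an illustrative sketch: the component volumes and stock concentrations are taken from the text, while the labels "forward primer" and "reverse primer" and the helper `final_concentration` are assumptions introduced here for the example.

```python
# Volumes (in microlitres) of the PCR reaction mix as listed in the text.
MASTER_MIX_UL = {
    "sterilized distilled water": 17.2,
    "10x Es Taq PCR buffer": 2.5,
    "dNTP mixture (2.5 mM each)": 2.0,
    "forward primer (10 uM)": 1.0,   # "1 uL of each primer": label is an assumption
    "reverse primer (10 uM)": 1.0,   # "1 uL of each primer": label is an assumption
    "Es Taq DNA polymerase": 0.3,
}
TEMPLATE_UL = 1.0  # genomic DNA added to the mix


def final_concentration(stock, volume_ul, total_ul):
    """Dilution formula C1 * V1 = C2 * V2, solved for C2 (same unit as stock)."""
    return stock * volume_ul / total_ul


# The listed components should add up to the stated 24 uL of reaction mix.
mix_total = round(sum(MASTER_MIX_UL.values()), 1)
reaction_total = mix_total + TEMPLATE_UL

# Final concentrations in the complete reaction.
dntp_mM = final_concentration(2.5, MASTER_MIX_UL["dNTP mixture (2.5 mM each)"], reaction_total)
primer_uM = final_concentration(10.0, MASTER_MIX_UL["forward primer (10 uM)"], reaction_total)
buffer_x = final_concentration(10.0, MASTER_MIX_UL["10x Es Taq PCR buffer"], reaction_total)

print(f"mix {mix_total} uL, reaction {reaction_total} uL")
print(f"dNTPs {dntp_mM:.2f} mM each, primers {primer_uM:.2f} uM each, buffer {buffer_x:.1f}x")
```

Under these figures the listed components sum to the stated 24 μL, and the complete reaction is 25 μL.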

Estimates of genetic divergence of mitochondrial genes

Genetic divergence was evaluated in MEGA 5.2 as the mean uncorrected p-distance for each of the PCGs, each of the rRNA genes, and the concatenated 13 PCGs, 2 rRNA genes, and 22 tRNA genes, respectively. Pairwise p-distances between species were averaged for each gene or combination of genes at three levels: within subfamily, within family, and within superfamily. The percentage of gaps and percentage of conserved positions for each of the 13 PCGs were calculated following Nardi et al. 73.
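As a minimal sketch of the quantity involved, the uncorrected p-distance is the proportion of compared sites at which two aligned sequences differ. The code below is not the MEGA implementation; it assumes pairwise deletion of gapped or ambiguous sites (the gap treatment used in the study is not stated), and the function names and toy sequences are hypothetical.

```python
from itertools import combinations

# Characters treated as missing data (assumption: pairwise deletion).
GAP_CHARS = set("-?N")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among the sites compared."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # skip sites that are missing in either sequence
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_p_distance(alignment):
    """Average p-distance over all pairs of sequences in {name: sequence}."""
    pairs = list(combinations(alignment.values(), 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment, invented for illustration only.
toy = {
    "taxon_A": "ATGGCATTAC",
    "taxon_B": "ATGGCTTTAC",
    "taxon_C": "ATAGC-TTAT",
}
print(round(mean_p_distance(toy), 4))
```

Averaging at a given taxonomic level then amounts to calling `mean_p_distance` on the subset of species that share a subfamily, family, or superfamily.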

Nucleotide substitution saturation

Nucleotide substitution saturation analyses were performed for each of the 13 PCGs, for the 1st & 2nd codon positions combined, and for the 3rd codon positions, using MEGA 5.2 following Huang 74.
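Saturation plots of this kind compare the numbers of transitions and transversions between sequence pairs with their evolutionary distance. The sketch below shows one way to obtain these counts per codon position; it is an illustration rather than the MEGA procedure, it assumes in-frame, gap-free codon alignments, and all names and sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Return (transitions, transversions, compared_sites) for two aligned sequences."""
    ts = tv = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A change within purines or within pyrimidines is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return ts, tv, compared


def codon_position(seq, position):
    """Extract all sites at codon position 1, 2 or 3 of an in-frame sequence."""
    return seq[position - 1::3]


# Toy in-frame sequences, invented for illustration only.
seq_a = "ATGACCGTA"
seq_b = "ATAACTGAA"

all_sites = count_ts_tv(seq_a, seq_b)
third = count_ts_tv(codon_position(seq_a, 3), codon_position(seq_b, 3))
second = count_ts_tv(codon_position(seq_a, 2), codon_position(seq_b, 2))
print(all_sites, third, second)
```

Plotting these counts against the pairwise p-distance for every species pair gives the kind of curve in which a plateau would indicate saturation.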

Phylogenetic analysis

Phylogenetic analyses were conducted with (subgroup_1) or without (subgroup_2) the non-calyptrate outgroups (Table 2).
Table 2

Summary of datasets used to perform phylogenetic analyses.

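Taxa | Dataset | Subanalyses | Models
subgroup_1 (40 species, 2 outgroups) | ALL: PCGs, rRNA genes, and a concatenation of tRNA genes | Not partitioned | GTR + I + G
subgroup_1 (40 species, 2 outgroups) | ALL: PCGs, rRNA genes, and a concatenation of tRNA genes | Partitioned by genes | P1 (ND2, COI, COII, ATP8, ATP6, COIII, ND3, ND6, CYTB): GTR + I + G; P2 (ND5, ND4, ND4L, ND1): GTR + I + G; P3 (lrRNA, srRNA, tRNA): GTR + I + G
subgroup_1 (40 species, 2 outgroups) | ALL: PCGs, rRNA genes, and a concatenation of tRNA genes | Without 3rd codon positions of PCGs | GTR + I + G
subgroup_1 (40 species, 2 outgroups) | PCG&rRNA: PCGs, rRNA genes | Not partitioned | GTR + I + G
subgroup_1 (40 species, 2 outgroups) | PCG&rRNA: PCGs, rRNA genes | Partitioned by genes | P1 (ND2, COI, COII, ATP8, ATP6, COIII, ND3, ND6, CYTB): GTR + I + G; P2 (ND5, ND4, ND4L, ND1): GTR + I + G; P3 (lrRNA, srRNA): GTR + I + G
subgroup_1 (40 species, 2 outgroups) | PCG&rRNA: PCGs, rRNA genes | Without 3rd codon positions of PCGs | GTR + I + G
subgroup_2 (38 species, rooted using Euryomma as outgroup) | ALL: PCGs, rRNA genes, and a concatenation of tRNA genes | Not partitioned | GTR + I + G
subgroup_2 (38 species, rooted using Euryomma as outgroup) | ALL: PCGs, rRNA genes, and a concatenation of tRNA genes | Partitioned by genes | P1 (ND2, COI, COII, ATP8, ATP6, COIII, ND3, ND6, CYTB): GTR + I + G; P2 (ND5, ND4, ND4L, ND1): GTR + I + G; P3 (lrRNA, srRNA, tRNA): GTR + I + G
subgroup_2 (38 species, rooted using Euryomma as outgroup) | ALL: PCGs, rRNA genes, and a concatenation of tRNA genes | Without 3rd codon positions of PCGs | GTR + I + G
subgroup_2 (38 species, rooted using Euryomma as outgroup) | PCG&rRNA: PCGs, rRNA genes | Not partitioned | GTR + I + G
subgroup_2 (38 species, rooted using Euryomma as outgroup) | PCG&rRNA: PCGs, rRNA genes | Partitioned by genes | P1 (ND2, COI, COII, ATP8, ATP6, COIII, ND3, ND6, CYTB): GTR + I + G; P2 (ND5, ND4, ND4L, ND1): GTR + I + G; P3 (lrRNA, srRNA): GTR + I + G
subgroup_2 (38 species, rooted using Euryomma as outgroup) | PCG&rRNA: PCGs, rRNA genes | Without 3rd codon positions of PCGs | GTR + I + G
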
For both subgroups, two phylogenetic analyses were conducted: (1) using PCGs and rRNA genes; (2) using PCGs, rRNA genes, and the concatenated tRNA genes. Phylogenetic trees were constructed using three approaches: (1) not partitioned; (2) partitioned by gene; and (3) excluding 3rd codon positions of PCGs, not partitioned. Each of the mitochondrial genes was aligned separately using MUSCLE 75, as implemented in MEGA 5.2. For protein-coding genes, nucleotide sequences were aligned after translation into amino acid sequences by reference, and then back-translated for analysis as nucleotide sequences. The 22 tRNA genes were aligned separately and then concatenated as a combined partition, as all individual tRNA genes are very short. Amino acid and nucleotide sequences were aligned using default settings. Individually aligned genes were then concatenated using SequenceMatrix 76 for phylogenetic analyses. MrModeltest 2.3 77 was used to select the best model for the non-partitioned data set, and for the data set that excluded the 3rd codon positions of PCGs. PartitionFinder v1.1.1 78 was used to evaluate the best partitioning scheme for the partitioned datasets (Table 2), using the "greedy" algorithm with branch lengths estimated as "unlinked" and following the Bayesian Information Criterion (BIC). Phylogenetic trees were inferred using Bayesian and Maximum Likelihood (ML) methods. Bayesian analyses were performed with MrBayes v3.2.1 79 on the CIPRES (Cyberinfrastructure for Phylogenetic Research) Science Gateway 80. Two independent runs were conducted, each with four chains (one cold and three heated), for 10 million generations, with samples drawn every 1000 generations. The first 25% of samples were discarded as burn-in. For both partitioned and unpartitioned data, the Maximum Likelihood analyses were performed with RAxML 81, using the rapid hill-climbing algorithm starting from 100 randomized maximum parsimony trees. Node support was evaluated via bootstrap tests with 1000 replicates.
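The concatenation step and the dataset without 3rd codon positions can be illustrated with a small sketch. This is not the SequenceMatrix implementation: the padding of missing genes with "?" is an assumption, the function names are hypothetical, and the sequences are toy data. The missing gene in the example mirrors the situation reported for W. magnifica, for which tRNAIle could not be sequenced.

```python
PCGS = {"COI"}  # toy set; the study used 13 protein-coding genes


def strip_third_positions(seq):
    """Keep codon positions 1 and 2 of an in-frame, codon-aligned sequence."""
    return "".join(base for i, base in enumerate(seq) if i % 3 != 2)


def concatenate(gene_alignments, gene_order):
    """Join per-gene alignments ({gene: {taxon: seq}}) into one supermatrix.

    Taxa lacking a gene are padded with '?' (assumption) so all rows keep
    the same length. Returns (matrix, partitions), where partitions maps
    each gene to its 1-based inclusive (start, end) columns.
    """
    taxa = sorted({taxon for gene in gene_order for taxon in gene_alignments[gene]})
    matrix = {taxon: "" for taxon in taxa}
    partitions = {}
    offset = 0
    for gene in gene_order:
        aln = gene_alignments[gene]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            matrix[taxon] += aln.get(taxon, "?" * length)
        partitions[gene] = (offset + 1, offset + length)
        offset += length
    return matrix, partitions


# Toy per-gene alignments, invented for illustration only.
genes = {
    "COI": {"G_pecorum": "ATGACC", "W_magnifica": "ATGACT"},
    "trnI": {"G_pecorum": "TTAG"},
}
order = ["COI", "trnI"]

full_matrix, full_parts = concatenate(genes, order)

# Remove 3rd codon positions from protein-coding genes only.
no_third = {
    gene: ({t: strip_third_positions(s) for t, s in aln.items()} if gene in PCGS else aln)
    for gene, aln in genes.items()
}
reduced_matrix, reduced_parts = concatenate(no_third, order)
print(full_matrix, full_parts)
print(reduced_matrix, reduced_parts)
```

The partition coordinates returned alongside the supermatrix are the kind of information a partitioned analysis needs in order to assign a model to each block of columns.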

Phylogenetic examination of separate genes

The Ktreedist program 82 was used to evaluate the relative contribution of each single mitochondrial gene to the construction of the phylogenetic tree. Bayesian analyses were performed using each of the 13 PCGs and the two rRNA genes as described above, and k-scores were calculated by comparing each of these trees with a reference tree obtained by a Bayesian analysis of the non-partitioned dataset of 13 PCGs and two rRNA genes for subgroup_2.
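The idea behind the k-score can be sketched as follows, based on our reading of the published description of Ktreedist and not on its source code: the comparison tree is rescaled by a factor K chosen to minimise the branch-length distance to the reference tree, and the k-score is the remaining distance. In this sketch each tree is represented simply as a mapping from a bipartition to its branch length; that representation, the function name, and the toy values are assumptions made for illustration.

```python
from math import sqrt


def k_score(reference, comparison):
    """Return (K, k-score) for two trees given as {bipartition: branch_length}.

    A bipartition present in only one tree is given length 0 in the other.
    K is the least-squares scale factor applied to the comparison tree.
    """
    splits = set(reference) | set(comparison)
    ref = [reference.get(s, 0.0) for s in splits]
    cmp_ = [comparison.get(s, 0.0) for s in splits]
    denom = sum(c * c for c in cmp_)
    if denom == 0:
        raise ValueError("comparison tree has no branch lengths")
    k = sum(r * c for r, c in zip(ref, cmp_)) / denom
    score = sqrt(sum((r - k * c) ** 2 for r, c in zip(ref, cmp_)))
    return k, score


# Toy trees keyed by the taxa on one side of each branch (illustrative only).
A, B, AB, AC = frozenset("A"), frozenset("B"), frozenset("AB"), frozenset("AC")
reference_tree = {A: 0.1, B: 0.2, AB: 0.3}
same_shape_longer = {A: 0.2, B: 0.4, AB: 0.6}     # same topology, branches doubled
different_topology = {A: 0.1, B: 0.2, AC: 0.3}    # one conflicting internal branch

k_same, score_same = k_score(reference_tree, same_shape_longer)
k_diff, score_diff = k_score(reference_tree, different_topology)
print(k_same, score_same, k_diff, score_diff)
```

A tree that differs from the reference only by a uniform scaling of its branches scores close to zero, whereas topological conflict or uneven branch lengths raise the score.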

Results

General features of the Gasterophilus pecorum and Wohlfahrtia magnifica mitogenomes

The complete mitogenome of G. pecorum (15 750 bp) and near-complete mitogenome of W. magnifica (14 705 bp) were sequenced (GenBank accession numbers KU578262-KU578263). The control region of W. magnifica could not be amplified, resulting in the failure to sequence tRNAIle. In both the order and the orientation of genes, the two sequences match those of all known calyptrates. They are circular molecules containing all 37 genes usually present in bilaterians: 13 PCGs, 22 tRNA genes, and 2 rRNA genes (Fig. 1, Table S2). The control regions are located at the same site (between srRNA and tRNAIle genes) as found in other calyptrate flies for which the complete mtDNA sequences are available 44, 45, 47, 49, 58-60, 62.
Figure 1

Mitochondrial genome maps of Gasterophilus pecorum (Fabricius) and Wohlfahrtia magnifica (Schiner). Gene names without underline indicate genes coded on the (+) strand, while those with underline are on the (-) strand. Transfer RNA (tRNA) genes are designated by single-letter amino acid codes. White regions indicate portions that could not be sequenced.

The start codons of all protein-coding genes were compared with data available from other dipterans. All protein-coding genes, except COI, have one of the common start codons for mitochondrial DNA: ATG, ATA, or ATT (Table S2). The start codon of COI was identified as TCG; COI has been reported to use a nonstandard start codon in species belonging to the Calyptratae 43-65. TAA, TAG, and T stop codons were found (Table S2), and further details about the mitogenomes of G. pecorum and W. magnifica are presented in Table 3, Table S2, and Fig. S1a, S1b. Specifically for G. pecorum, the AT contents of the complete genome, lrRNA gene, srRNA gene, and control region (70.7%, 75.4%, 72.3%, and 80.8%, respectively) are the lowest when compared with other species of the Calyptratae (Table 3, Table S3).
Table 3

Nucleotide composition of Gasterophilus pecorum (Fabricius) / Wohlfahrtia magnifica (Schiner).

Region | T(U) (%) | C (%) | A (%) | G (%) | A+T (%) | AT-skew | GC-skew
Whole genome | 32.4 / N * | 18.9 / N | 38.4 / N | 10.3 / N | 70.7 / N | 0.08 / N | -0.29 / N
Protein-coding genes | 39.5 / 43.3 | 16.2 / 12.8 | 29.0 / 31.3 | 15.3 / 12.6 | 68.5 / 74.6 | -0.15 / -0.16 | -0.03 / -0.01
1st codon position | 33.9 / 36.6 | 14.6 / 12.3 | 30.1 / 31.6 | 21.3 / 19.5 | 64.0 / 68.2 | -0.06 / -0.07 | 0.19 / 0.23
2nd codon position | 45.9 / 46.3 | 20.2 / 19.5 | 19.4 / 20.1 | 14.5 / 14.0 | 65.3 / 66.4 | -0.41 / -0.39 | -0.16 / -0.16
3rd codon position | 38.7 / 46.9 | 13.7 / 6.5 | 37.6 / 42.3 | 10.0 / 4.2 | 76.3 / 89.2 | -0.01 / -0.05 | -0.16 / -0.21
tRNA | 38.1 / N | 10.8 / N | 37.4 / N | 13.6 / N | 75.5 / N | -0.01 / N | 0.11 / N
lrRNA | 42.1 / 41.5 | 6.9 / 6.4 | 33.3 / 38.8 | 17.6 / 13.3 | 75.4 / 80.3 | -0.12 / -0.03 | 0.43 / 0.35
srRNA | 38.2 / 37.5 | 9.3 / 8.5 | 34.1 / 37.7 | 18.4 / 16.3 | 72.3 / 75.2 | -0.06 / 0.00 | 0.33 / 0.31
Control Region | 45.1 / N | 4.9 / N | 35.8 / N | 14.3 / N | 80.8 / N | -0.12 / N | 0.49 / N

* N = Not available.
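The skew values in Table 3 are consistent with the commonly used definitions AT-skew = (A - T) / (A + T) and GC-skew = (G - C) / (G + C). The text does not state the formulas, so they are an assumption here, as are the function names. The sketch below recomputes the whole-genome skews of G. pecorum from the percentages in the table.

```python
from collections import Counter


def base_percentages(sequence):
    """Percentage of A, C, G and T among the unambiguous bases of a sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: 100 * counts[b] / total for b in "ACGT"}


def skews(a, t, c, g):
    """Return (AT-skew, GC-skew) from base counts or percentages."""
    return (a - t) / (a + t), (g - c) / (g + c)


# Whole-genome composition of G. pecorum as given in Table 3.
at_skew, gc_skew = skews(a=38.4, t=32.4, c=18.9, g=10.3)
print(round(at_skew, 2), round(gc_skew, 2))

# The same functions applied to a raw sequence (toy example).
toy = base_percentages("AATTCCGG")
print(toy, skews(toy["A"], toy["T"], toy["C"], toy["G"]))
```

Rounded to two decimals, the recomputed values agree with the whole-genome entries of the table (0.08 and -0.29).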

Genetic divergence

The average p-distances of PCGs show greater variation than that of rRNA genes, and ND6 and ATP8 genes show the largest distance and standard deviation respectively (Fig. 2, Table S4). Percentages of gaps in each protein-coding gene alignment are below 10%, while percentages of conserved positions range from 39.96% to 60.04% (Fig. 3, Table S5).
Figure 2

Comparisons of average mitochondrial gene p-distances at different taxonomic levels in 38 species of Calyptratae. Black bars indicate the standard deviation.

Figure 3

Percentage of conserved sites (%cons) and percentage of positions experiencing gaps (%gaps) in the alignments of the 13 protein-coding genes.
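The two quantities in this figure can be approximated per alignment as the share of columns that are identical across all sequences and the share of columns containing at least one gap. These definitions are an assumption for illustration; the exact procedure of Nardi et al. may differ, and the function name and toy alignment are hypothetical.

```python
def column_stats(alignment):
    """Return (%conserved, %gapped) columns for equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    columns = list(zip(*alignment))
    # A column counts as gapped if any sequence has a gap there.
    gapped = sum("-" in col for col in columns)
    # A column counts as conserved if it is gap-free and shows a single state.
    conserved = sum("-" not in col and len(set(col)) == 1 for col in columns)
    total = len(columns)
    return 100 * conserved / total, 100 * gapped / total


# Toy alignment, invented for illustration only.
pct_conserved, pct_gapped = column_stats(["ATGGCA", "ATGACA", "AT-GCA"])
print(round(pct_conserved, 2), round(pct_gapped, 2))
```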

Nucleotide substitution saturation was estimated (Fig. 4), and there is little or no sign of saturation. The numbers of transitions and transversions in the 13 PCGs increase with increasing evolutionary distance. For the 1st & 2nd codon positions, transitions and transversions increase with the p-distance, while for the 3rd codon positions the number of transitions shows a plateau at p-distance values around 0.15-0.20 but increases beyond that.
Figure 4

Nucleotide substitution saturation of each of the 13 PCGs, of the 1st & 2nd codon positions of PCGs, and of the 3rd codon positions of PCGs.

Four parallel analyses are performed: subgroup_1 with all genes (i.e., 13 PCGs, 2 rRNA genes, and concatenation of 22 tRNA genes) (subgroup_1_ALL); subgroup_1 with all but tRNA genes (subgroup_1_PCG&rRNA); subgroup_2 with all genes (subgroup_2_ALL), and; subgroup_2 with all but tRNA genes (subgroup_2_PCG&rRNA). For subgroup_1, all analyses place G. pecorum at the basal split of the Calyptratae with strong support (Figs. 5, 6). Topologies of the remaining calyptrates of most datasets show a consistent relationship of (Fanniidae-Muscidae ((Anthomyiidae + Scathophagidae) (remaining Calliphoridae (Sarcophagidae (Oestridae (Pollenia + Tachinidae)))))), except for the unpartitioned Bayesian tree of subgroup_1_PCG&rRNA, and ML tree of subgroup_1 excluding 3rd codon positions of PCGs, which are inferred with the clade of muscoid grade or (Anthomyiidae + Scathophagidae) nested within the Oestroidea.
Figure 5

Phylogeny of subgroup_1, inferred from mitochondrial datasets comprising 13 protein-coding genes, 2 rRNA genes, and concatenation of tRNA genes. A1, Bayesian tree inferred from not partitioned data. A2, Bayesian tree inferred from partitioned data. A3, Bayesian tree inferred from data excluding 3rd codon positions of PCGs. B1, ML tree inferred from not partitioned data. B2, ML tree inferred from partitioned data. B3, ML tree inferred from data excluding 3rd codon positions of PCGs. Numbers at nodes are posterior probabilities (Bayesian trees), and Bootstrap values (ML trees).

Figure 6

Phylogeny of subgroup_1, inferred from mitochondrial datasets comprising 13 protein-coding genes and 2 rRNA genes. A1, Bayesian tree inferred from not partitioned data. A2, Bayesian tree inferred from partitioned data. A3, Bayesian tree inferred from data excluding 3rd codon positions of PCGs. B1, ML tree inferred from not partitioned data. B2, ML tree inferred from partitioned data. B3, ML tree inferred from data excluding 3rd codon positions of PCGs. Numbers at nodes are posterior probabilities (Bayesian trees), and Bootstrap values (ML trees).

For subgroup_2, all analyses infer an identical oestroid family-level topology of (remaining Calliphoridae (Sarcophagidae (Oestridae (Pollenia + Tachinidae)))), except for the Bayesian tree of subgroup_2_PCG&rRNA without 3rd codon positions of PCGs, which places Pollenia together with Oestridae rather than Tachinidae (Figs. 7, 8). At intrafamilial level, relationships within Sarcophagidae (within Sarcophaga) present small changes when the data are partitioned. When the 3rd codon positions of PCGs are excluded, the Muscidae emerge as monophyletic and relationships within Sarcophagidae (within Sarcophaga), Calliphoridae (within subfamily Chrysomyinae), and within Oestridae are different from relationships inferred from datasets with this position included. Each family is moderately or well supported, except for the Oestridae, and Bayesian and ML inference fail to reach an agreement on the relationships within this family, with either Dermatobia or Gasterophilus at the base.
Figure 7

Phylogeny of subgroup_2, inferred from mitochondrial datasets comprising 13 protein-coding genes, 2 rRNA genes, and concatenation of tRNA genes. A1, Bayesian tree inferred from non-partitioned data. A2, Bayesian tree inferred from partitioned data. A3, Bayesian tree inferred from data excluding 3rd codon position of PCGs. B1, ML tree inferred from not partitioned data. B2, ML tree inferred from partitioned data. B3, ML tree inferred from data excluding 3rd codon positions of PCGs. Numbers at nodes are posterior probabilities (Bayesian trees), and Bootstrap values (ML trees).

Figure 8

Phylogeny of subgroup_2, inferred from mitochondrial datasets comprising 13 protein-coding genes and 2 rRNA genes. A1, Bayesian tree inferred from not partitioned data. A2, Bayesian tree inferred from partitioned data. A3, Bayesian tree inferred from data excluding 3rd codon positions of PCGs. B1, ML tree inferred from not partitioned data. B2, ML tree inferred from partitioned data. B3, ML tree inferred from data excluding 3rd codon positions of PCGs. Numbers at nodes are posterior probabilities (Bayesian trees), and Bootstrap values (ML trees).

The set of k-scores calculated by the Ktreedist program is shown in Fig. 9. High scores indicate a poor match between the comparison tree and the reference tree. For the PCGs, ND2 and ND5 produce trees that match well with the topology of the reference tree. The ND6 and ATP8 genes show the most deviant topologies.
Figure 9

Calyptrate mitochondrial gene k-scores calculated by Ktreedist, measuring overall differences in the relative branch length and topology of the phylogenetic trees generated by single protein-coding genes compared to the combined dataset.

Discussion

Inferred topologies can be recognized as being of two types: (1) for analyses with the two non-calyptrate outgroups, the muscoid families are nested within a paraphyletic Oestroidea in the Bayesian trees based on the unpartitioned dataset with all codon sites, and G. pecorum takes part in the basal split of the Calyptratae; (2) for analyses without the two non-calyptrate outgroups, the Oestroidea are inferred as monophyletic, which is in agreement with previous studies 11, 17. When tRNA genes are included in the matrix in the present study, some node supports decrease (Figs. 7, 8), and relationships within Oestridae vary when data are partitioned (Figs. 7B1, 7B2). Besides, in the analyses without the two non-calyptrate outgroups, topologies within the Calliphoridae change noticeably at genus level when the 3rd codon positions of PCGs are omitted. Taken together, the trees of subgroup_2 excluding tRNA genes but including all codon sites of PCGs are thought to be the most reliable in the present study. Our analyses highlight the importance of taxon sampling in phylogeny reconstruction. Appropriate taxon sampling is very important for accurate phylogenetic estimation 6, but there are some disagreements whether phylogenies are improved by increased taxon sampling. Some have argued that adding taxa would decrease accuracy 83, 84, or at least does not help resolve conflicts 40, while others believe that increased sampling improves accuracy [e.g., 85-89]. The present analysis includes many more species than previous mitogenomic studies of calyptrate flies (e.g., Nelson et al. 44, with 20 species, 13 of which are calliphorids; Zhao et al. 58, with 9 calyptrate species; Ding et al. 61, with 10 calyptrate species) and also includes all available mitogenomes. Our research strongly supports monophyly of Oestroidea, as well as relationships within Sarcophagidae and within Calliphoridae (excluding Pollenia). 
Although we cannot resolve relationships within Oestridae, bot fly monophyly is well supported in the Bayesian analyses. The different results among the previous studies most likely reflect different coverage of taxa. Our approach, which includes more taxa, yields a better supported topology. It is interesting that all three bot flies have remarkably long branches, with G. pecorum showing the longest terminal branch of any calyptrate in our analyses and H. lineatum the next longest. For all analyses of subgroup_1, G. pecorum is separated from the remaining bot flies and located at the base of the Calyptratae (Figs. 5, 6). With both morphology and biology providing very strong support for bot fly monophyly 29, this placement is very likely an artefact, which is here considered to be caused by long-branch attraction 90, 91 between G. pecorum and the long branches of both outgroups. Improved taxon sampling can help minimise this effect 74, 92, either by a denser taxon sampling 88, 89, 91, or by optimization of outgroup selection 74. We optimized outgroup selection by omitting the non-calyptrate outgroups (subgroup_2) and instead rooted the tree at Euryomma from Fanniidae, as this family has been found to be the sister group to the non-hippoboscoid calyptrates in other studies 11, 17. Following the exclusion of non-calyptrate outgroups, G. pecorum is pulled into the Oestroidea, where it clusters together with the remaining oestrids, and the monophyletic Oestroidea is nested within the muscoid grade, consistent with the widely accepted relationships 11, 17, 19. These results indicate that the overall phylogeny, and in particular the placement of G. pecorum, is influenced by the long branches of the two non-calyptrate outgroups. Similarly, the non-monophyletic Oestroidea inferred by Ding et al. 61 may be an artefact caused by the use of distant outgroups (Tabanidae, Nemestrinidae). 
Clearly, close attention should be paid to the selection of outgroups, as inappropriate (e.g., very distant) outgroups may cause long-branch attraction and result in erroneous phylogeny reconstructions.

Reliability of single mitochondrial genes

We document two new mitogenomes in the subfamilies Gasterophilinae (of Oestridae) and Paramacronychiinae (of Sarcophagidae), thereby providing an opportunity to reanalyse the phylogeny of the Calyptratae in the light of recent results 44, 58 and with a focus on mitogenomic performance. The contribution of each mitochondrial gene to the calyptrate mitogenome phylogeny is estimated by calculating the k-score 82 for each gene. Phylogenetic resolution varies among genes: for the rRNA genes, lrRNA provides relatively more topological resolution than srRNA; for the protein-coding genes, according to their k-scores, trees based on ND2 and ND5 are closest to the reference tree, followed by trees based on ND1, COIII, COI, and ND4L, while trees based on ATP8 and ND6 are farthest from the reference tree. Surprisingly, the widely used CYTB provides relatively little topological resolution. Furthermore, our different analyses for each protein-coding gene suggest that ATP8 and ND6 are faster evolving genes (see Figs. 3, 4). The evidence suggests that ATP8 and ND6 may contribute least to calyptrate phylogeny reconstruction, and that lrRNA, ND2, ND5, ND1, COIII, and COI perform better than other mitochondrial genes. Similar conclusions were reached in other phylogenetic analyses 15, 73. In contrast, the most conservative genes (i.e., concatenation of tRNA genes) decrease some node supports in the present study (Figs. 7A1, 7A2, 7B1, 7B2, Figs. 8A1, 8A2, 8B1, 8B2). Salichos & Rokas 41 similarly concluded that selecting genes with strong phylogenetic signal is very important for the accurate reconstruction of ancient divergences.

Data partitioning

The effects of data partitioning and partition schemes on phylogeny reconstruction have been widely investigated [e.g., 93, 94]. Different partitioning schemes may have no effect at a certain level, but can result in strong nodal support for otherwise conflicting topologies at more inclusive levels 14, 93-96, suggesting that partitioning has most effect at deeper phylogenetic levels 5. In this study, data partitioning has little influence except for minor changes in node supports.

Excluding 3rd codon position of protein-coding genes

The 3rd codon positions are sometimes excluded in phylogeny reconstruction, generally because this codon site can be highly saturated and is therefore considered less informative 58, 93, 97. In the present study, only small topological differences resulted from excluding the 3rd codon positions of PCGs, the major ones being Pollenia clustering together with Oestridae rather than with the Tachinidae, and the Muscidae being monophyletic in the Bayesian tree (Fig. 7A3, Fig. 8A3). However, phylogenetic relationships within families vary, depending upon whether the 3rd codon positions are pruned. Testing nucleotide substitution saturation of PCGs revealed that the 3rd codon positions of PCGs in the present study are not saturated, or at most show partial saturation for transversions (Fig. 4). Interestingly, Caravas et al. 98 estimated the performance of 3rd codon positions of Diptera and found that they still resolved some recent clades within the Calyptratae. Taken together, these results indicate that the 3rd codon positions of PCGs are informative in calyptrate phylogeny reconstruction and should not be pruned.

tRNA genes

Calyptrate phylogenies based on datasets with and without tRNA genes are almost identical, except for slight differences in some node supports and branch lengths, irrespective of the analytic methods (i.e., with or without 3rd codon positions of PCGs, partitioned or not). Genetic divergence analysis also shows that tRNA genes are more conserved than other mitochondrial genes. Kumazawa et al. 99 proposed that tRNA genes may be useful for resolving deep splits that occurred some hundreds of millions of years ago. Since calyptrates are estimated to have appeared 70-55 million years ago 11, 16, 100, the phylogenetic signal of tRNA genes in this group is not strong enough to help resolve calyptrate phylogeny.

Phylogenetic topology

The Calyptratae are considered “one of the most surely grounded monophyletic groups within the Schizophora” (101, see also 27, 102) based on morphological data, and phylogenies from molecular data have been in strong agreement 11. The group has been subject to very extensive phylogenetic analyses, using both morphological and variable amounts of molecular data, e.g., 17-22, 26, 27, 29, 31, 33, 34, 37, 38, 44, 58, 61, 103-115. However, some relationships within the Calyptratae remain controversial, primarily because of differences in the selection of molecular markers and in the taxa sampled for phylogeny reconstruction. The favoured tree topologies in the present study (Figs. 7, 8) differ in some important respects from those of the previous study based on whole mitogenomes 44, and from those obtained using a small data set of two mitochondrial and two nuclear genes 26. The relationships at the family level are almost identical across all three studies, except for the placement of Sarcophagidae in the analysis using whole mitogenomes. In the present study, Sarcophagidae is sister group to the clade (Oestridae (Tachinidae + Pollenia)), while Nelson et al. 44 place the Sarcophagidae and Calliphoridae (except for Pollenia) as sister groups. This difference may be due to the limited representation of taxa from the Sarcophagidae: Nelson et al. 44 focus on the phylogeny within the Calliphoridae, and so the Sarcophagidae are represented by a single species, compared with 9 species representing two subfamilies in the present study. A noticeable difference between these studies lies at the subfamily level within the Calliphoridae. With 11 subfamilies 33, the non-monophyletic Calliphoridae represent a considerable challenge for calyptrate phylogeny reconstruction 26, 33, 44.
Calliphoridae are non-monophyletic in the present study, as in previous studies 11, 17, 26, 44. Combining these results, the clade (Chrysomyinae, (Calliphorinae + Luciliinae)) appears to form a robust relationship, and both Mesembrinellinae and Polleninae are phylogenetically closer to the Tachinidae than to other calliphorids and would as such warrant separate status as valid families. The placement of Sarcophagidae and Rhiniidae (formerly Calliphoridae: Rhiniinae) by Kutty et al. 17 was not well supported in Marinho et al. 26, highlighting how increasing the number of molecular markers helps produce more robust phylogenies.

Conclusion

The mitogenome in general provides informative molecular markers in calyptrate phylogenetic research. Partitioning data by genes does not change the phylogenetic topology at the family level but improves some node supports. Similarly, the topology is hardly changed by including conservative tRNA genes, except for some slightly reduced node supports. Taxon sampling plays an important role in calyptrate phylogeny reconstruction, and more stable family-level relationships can be inferred by increased coverage. More taxa and nuclear genes should be selected in calyptrate phylogeny reconstruction in the future, in order to break long branches and improve resolution. Family-level relationships within the Oestroidea are poorly understood and difficult to resolve. One of the more challenging aspects of resolving oestroid relationships is the non-monophyly of the traditional Calliphoridae 33. Another is the position of the Oestridae, which is a small family of fewer than 200 species 116, 117, the larvae of which are obligate parasites of mammals. Solving the issue of bot fly ancestry is closely connected to establishing the phylogenetic relationships for the subgroups of the traditional Calliphoridae. Supplementary Tables and Figures.