Dissecting the ancient rapid radiation of microgastrine wasp genera using additional nuclear genes.

Jonathan C Banks, James B Whitfield.

Abstract

Previous estimates of a generic level phylogeny for the ubiquitous parasitoid wasp subfamily Microgastrinae (Hymenoptera) have been problematic due to short internal branches deep in the phylogeny. These short branches might be attributed to a rapid radiation among the taxa, the use of genes that are unsuitable for the levels of divergence being examined, or insufficient quantity of data. We added over 1200 nucleotides from four nuclear genes to a dataset derived from three genes to produce a dataset of over 3000 nucleotides per taxon. While the number of well-supported short branches in the phylogeny increased, we still did not obtain strong bootstrap support for every node. Parametric and nonparametric bootstrap simulations projected that an enormous, and likely unobtainable, amount of data would be required to get bootstrap support greater than 50% for every node. However, a marked increase in the number of well-supported nodes was seen when we conducted a Bayesian analysis of a combined dataset generated from morphological characters added to the seven gene dataset. Our results suggest that, in some cases, combining morphological and genetic characters may be the most practical way to increase support for short branches deep in a phylogeny.

Year: 2006; PMID: 16854601; PMCID: PMC7129091; DOI: 10.1016/j.ympev.2006.06.001

Source DB: PubMed; Journal: Mol Phylogenet Evol; ISSN: 1055-7903; Impact factor: 4.286


Introduction

Uncertainty in phylogenetic estimation at higher taxonomic levels is inevitable, due to the confounding effects of factors that may indicate alternative patterns. These factors include the convergence of morphological characters from similar ecological forces, and multiple substitutions in genetic data (“saturation”). Convergence and saturation often result in low bootstrap support values, poor Bremer decay indices or low Bayesian posterior probability values for some branches (Swofford et al., 1996). However, poor branch support can also be caused by failure to use a sufficient quantity of data (Fishbein et al., 2001), use of data that are inappropriate for the level of divergence that is being analysed (de Queiroz et al., 1995), or rapid evolutionary radiations among taxa (Fishbein et al., 2001). Often, it is difficult to know which factors are operating in any particular case. Although phylogenies without strong support for all branches are sometimes well accepted, there are situations, such as the study of cophylogenetic relationships between hosts and associates, in which well-supported phylogenies are important. For example, reconciliation analysis (Page, 1995), the method most commonly used to examine cophylogenetic relationships (Brooks and McLennan, 2003), infers cophylogenetic history from the topology of the host and associate phylogenies and thus requires robust phylogenies to reconstruct the evolutionary history of the relationship between hosts and associates. Other situations requiring robust phylogenies include the forensic use of phylogenies to identify the source of infections such as human immunodeficiency virus (Korber et al., 2000, Rambaut et al., 2001, Worobey et al., 2004) and severe acute respiratory syndrome (SARS) (Guan et al., 2003). One example of poor support possibly caused by several factors occurs in the phylogenies estimated for microgastrine wasps (Whitfield et al., 2002).
Microgastrinae, a subfamily of Braconidae (Hymenoptera), is a speciose group with approximately 1400 described species in over 55 genera, and it has been estimated that there may actually be 5000 to 10,000 species worldwide (Whitfield, 1997b, Whitfield et al., 2002). Microgastrine wasps lay their eggs on lepidopteran larvae, and the wasp larvae develop while consuming the tissues of the lepidopteran larvae (Whitfield, 1997b, Whitfield et al., 2002). Many microgastrine wasp species have been transferred around the world to aid in the control of crop pests (Whitfield, 1997b, Whitfield et al., 2002). All microgastrine wasps have inherited an association with polydnaviruses, which are incorporated into the wasp genomes and help the wasp larvae evade lepidopteran immune systems (Whitfield and Asgari, 2003). It has therefore been of considerable coevolutionary interest to compare the phylogenetic histories of the wasps and those of the viruses. A robust phylogenetic framework is essential for producing a useful and informative classification for this large, economically and ecologically important insect group. Previous work that estimated a phylogeny for the microgastrines from 2300 nucleotides from three genes (16S, 28S and COI) and 53 morphological characters found a tree with low bootstrap support for many branches (Mardulyn and Whitfield, 1999, Whitfield et al., 2002). The poorly supported branches in the microgastrine phylogenies are mainly short internal branches (Mardulyn and Whitfield, 1999, Whitfield et al., 2002). It was proposed that the short branches might have arisen from a rapid radiation as the microgastrines colonised new lepidopteran host species (Mardulyn and Whitfield, 1999), which themselves may have been diversifying in the early Tertiary (Grimaldi, 1999, Whitfield, 2002). Support for the rapid radiation of microgastrines was bolstered by the fact that the same branches were estimated to be short from multiple data sources. 
However, it was also acknowledged that the poorly supported short branches may have been due to insufficient data or the use of genes with rates of divergence that are inappropriate for the levels of divergence between the taxa (Mardulyn and Whitfield, 1999). Here we present analyses of data from two mitochondrial and five nuclear genes, including the genetic data (16S, 28S and COI) and 53 morphological characters analysed by Whitfield et al. (2002). These analyses show that completely robustly supported phylogenies for Microgastrinae are unlikely to be estimated from genetic data alone. We use parametric and nonparametric bootstrapping of simulated datasets to estimate how much data would be required to resolve the phylogeny with every branch having nonparametric bootstrap values greater than 50%. The simulations show that unless an impractically large amount of molecular data is obtained, the use of morphological characters may be necessary to produce a completely robustly supported phylogeny that can be used to examine cophylogenetic relationships between microgastrine wasps and polydnaviruses.

Methods

Wasps were stored in 100% ethanol at 4 °C until genomic DNA could be extracted. Specimens were identified by JBW to genus, and to species where possible, using morphological characters and often also host data. Taxa from which sequences were obtained are listed in Table 1. Because we had few sequences from Apanteles canarsiae, we pooled sequences for A. canarsiae with those for A. galleriae, and the resulting “chimera” is labelled Apanteles sp. in the phylogenies. Whole wasps were macerated using mini mortars and pestles, and the DNA was extracted using Qiagen DNeasy tissue extraction kits. Polymerase chain reactions (PCRs) were carried out with an Eppendorf Mastercycler thermocycler. Each reaction contained 2.5 μL of Hotmaster buffer (Eppendorf), 1.2 μL of dNTPs (8 mM), 2.5 μL of each primer (2.5 μM), 0.125 μL of Hotmaster Taq (5 units/μL, Eppendorf), 0.8 μL of DNA and 15.375 μL of water. Thermal cycling consisted of an initial denaturing step of 94 °C for 2 min, followed by 35 cycles of 94 °C for 20 s, 20 s at the annealing temperatures listed in Table 2, and 65 °C for 40 to 60 s depending on the size of the target region, with a final step of 65 °C for 5 min. Primer sequences are listed in Table 2. A negative control, using water rather than DNA, was included in each amplification round. PCR products were purified using Qiagen QIAquick kits. Sequencing was carried out on an ABI 3730 capillary sequencer.
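As a quick arithmetic check on the reaction mix described above, the component volumes can be summed and the stock concentrations converted to final concentrations. This is only an illustrative sketch: the variable and function names are ours, and it assumes that the listed volumes make up the whole reaction.

```python
# Component volumes in microlitres, as listed in the Methods.
volumes_ul = {
    "Hotmaster buffer": 2.5,
    "dNTPs (8 mM stock)": 1.2,
    "forward primer (2.5 uM stock)": 2.5,
    "reverse primer (2.5 uM stock)": 2.5,
    "Hotmaster Taq (5 units/uL)": 0.125,
    "template DNA": 0.8,
    "water": 15.375,
}

# Total reaction volume (assumes nothing else was added).
total_ul = sum(volumes_ul.values())


def final_concentration(stock_conc, volume_ul, total=total_ul):
    """Dilute a stock concentration into the total reaction volume."""
    return stock_conc * volume_ul / total


dntp_final_mm = final_concentration(8.0, 1.2)    # mM
primer_final_um = final_concentration(2.5, 2.5)  # uM, per primer
taq_units = 5.0 * 0.125                          # units per reaction

print(f"total volume: {total_ul:.3f} uL")
print(f"dNTPs: {dntp_final_mm:.3f} mM, each primer: {primer_final_um:.2f} uM, "
      f"Taq: {taq_units:.3f} units")
```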
Table 1

Taxa sequenced and GenBank accession numbers

Taxon | Accession numbers, in column order: 16S, 28S, arginine kinase exon 1, arginine kinase exon 2, COI, EF1α, opsin exon 1, opsin exon 2, wingless
Alphomelon sp. | AF102752, AF102732, DQ538920, DQ538866, AF102707, DQ538631, DQ538754, DQ538696, DQ538574
Apanteles canarsiae | AF102750, AF102728
Apanteles galleriae | DQ538812, DQ538632, DQ538755, DQ538697, DQ538575
Apanteles nephoptericis | AF102763, AF102745, DQ538921, DQ538867, DQ538813, DQ538633, DQ538756, DQ538698, DQ538576
CARDIOCHILES sp. | DQ538553, DQ538961, DQ538901, DQ538843, DQ538672, DQ538795, DQ538737, DQ538613
CHELONUS sp. | DQ538554, AJ535956, DQ538902, DQ538844, DQ538673, DQ538796, DQ538738, DQ538614
Choeras sp. | DQ538526, AY044218, DQ538922, DQ538868, DQ538634, DQ538757, DQ538699, DQ538577
Cotesia congregata | DQ538527, DQ538975, DQ538923, DQ538869, DQ538815, DQ538635, DQ538758, DQ538700, DQ538578
Cotesia electrae | DQ538529, AJ535938, DQ538924, DQ538870, DQ538817, DQ538637, DQ538760, DQ538702
Cotesia flaviconchae | DQ538531, DQ538978, DQ538926, DQ538872, DQ538819, DQ538639, DQ538762, DQ538704, DQ538582
Cotesia hyphantriae | DQ538532, DQ538979, DQ538927, DQ538873, DQ538820, DQ538640, DQ538763, DQ538705, DQ538583
Cotesia melanoscela | DQ538533, DQ538980, DQ538928, DQ538874, DQ538821, DQ538641, DQ538764, DQ538706, DQ538584
Cotesia obscuricornis | DQ538534, DQ538981, DQ538929, DQ538875, DQ538822, DQ538642, DQ538765, DQ538707, DQ538585
Cotesia rubecula | DQ538535, DQ538982, DQ538930, DQ538876, DQ538823, DQ538643, DQ538766, DQ538708, DQ538586
Cotesia sesamiae | AF110827, AJ535952, DQ538645, DQ538768, DQ538710, DQ538588
Deuterixys rimulosa | DQ538537, AY044219, DQ538931, DQ538877, DQ538646, DQ538769, DQ538711, DQ538589
Diolcogaster bakeri | DQ538538, AJ535954, DQ538647, DQ538770, DQ538712, DQ538590
Diolcogaster schizurae | AF102759, AF102741, DQ538932, DQ538878, DQ538825, DQ538648, DQ538771, DQ538713, DQ538591
Dolichogenidea lacteicolor | AF102761, AF102742, DQ538933, DQ538879, DQ538826, DQ538649, DQ538772, DQ538714, DQ538592
EPSILOGASTER sp. | DQ538555, DQ538997, DQ538955, DQ538903, DQ538845, DQ538674, DQ538797, DQ538739, DQ538615
Fornicia sp. | AY044195, AY044210, DQ538650, DQ538773, DQ538715
Glyptapanteles indiensis | AF102757, AF102738, DQ538934, DQ538880, DQ538827, DQ538651, DQ538774, DQ538716, DQ538593
Glyptapanteles porthetriae | AF102758, AF102739, DQ538935, DQ538881, DQ538828, DQ538652, DQ538775, DQ538717, DQ538594
Hypomicrogaster sp. Costa Rica | DQ538539, DQ538936, DQ538882, DQ538829, DQ538776, DQ538718, DQ538595
Hypomicrogaster ecdytolophae | AF102756, AF102757, AF102712, DQ538653, DQ538777, DQ538719, DQ538596
Microgaster canadensis | U98154, AF102733, DQ538937, DQ538883, AF102708, DQ538654, DQ538778, DQ538720, DQ538597
Microplitis demolitor | DQ538540, DQ538985, DQ538938, DQ538884, DQ538830, DQ538655, DQ538779, DQ538721, DQ538598
MIRAX sp. | DQ538556, AF102747, DQ538956, DQ538846, DQ538675, DQ538798, DQ538740, DQ538616
Parapanteles sp. | AF102753, AF102734, DQ538939, DQ538885, DQ538831, DQ538656, DQ538780, DQ538722, DQ538599
PHANEROTOMA sp. | DQ538557, DQ538998, DQ538957, DQ538904, DQ538847, DQ538676, DQ538617
Pholetesor bedelliae | U68153, AF102740, DQ538940, DQ538886, AF102715, DQ538657, DQ538781, DQ538723, DQ538600
Prasmodon sp. 1 | DQ538541, DQ538986, DQ538941, DQ538887, DQ538832, DQ538658, DQ538782, DQ538724, DQ538601
Prasmodon sp. 2 | AF102748, AF102725, DQ538942, DQ538888, AF102700, DQ538659, DQ538783, DQ538725, DQ538602
Promicrogaster sp. 1 | DQ538542, DQ538987, DQ538943, DQ538889, DQ538660, DQ538784, DQ538726, DQ538603
Promicrogaster sp. 2 | DQ538543, DQ538988, DQ538944, DQ538890, DQ538833, DQ538661, DQ538785, DQ538727, DQ538604
Pseudapanteles sp. | DQ538545, DQ538990, DQ538945, DQ538892, DQ538835, DQ538663, DQ538787, DQ538729
Rhygoplitis sp. 1 | DQ538546, DQ538991, DQ538946, DQ538893, DQ538836, DQ538664, DQ538788, DQ538730, DQ538606
Rhygoplitis sp. 2 | DQ538547, DQ538992, DQ538947, DQ538894, DQ538837, DQ538789, DQ538731, DQ538607
Sendaphne sp. | DQ538548, DQ538993, DQ538948, DQ538895, DQ538838, DQ538666, DQ538790, DQ538732, DQ538608
Snellenius sp. 1 | AF102749, AF102776, DQ538949, DQ538896, DQ538839, DQ538667, DQ538791, DQ538733, DQ538609
Snellenius sp. 2 | DQ538549, DQ538994, DQ538950, DQ538897, DQ538840, DQ538668, DQ538792, DQ538734, DQ538610
TOXONEURON NIGRICEPS | U68151, AF029120, DQ538905, AF102724, DQ538677, DQ538800, DQ538742, DQ538618
Venanides sp. | DQ538550, DQ538951, DQ538898, DQ538793, DQ538735
Venanus sp. | DQ538551, DQ538995, DQ538952, DQ538899, DQ538841, DQ538670, DQ538794, DQ538736, DQ538611
VENTURIA CANESCENS | DQ538560, DQ539001, DQ538960, DQ538908, DQ538851, DQ538680, DQ538801, DQ538743, DQ538619
Xanthomicrogaster sp. | DQ538552, DQ538996, DQ538953, DQ538900, DQ538671, DQ538612

Where accession numbers are absent, we failed to get sequences for that region of that taxon. Taxa in capitals are outgroups.

Table 2

Primers used in this study

Gene | Annealing temperature (°C) | Direction | Primer name | Sequence | Reference
16S | 52–57 | Forward | 16S outer | CTTATTCAACATCGAGGTC | Whitfield, 1997a
16S | 52–57 | Reverse | 16SWb | CACCTGTTTATCAAAACAT | Dowton and Austin, 1994
28S | 55–62 | Forward | 28SF | CACCTGTTTATCAAAAACAT | Mardulyn and Whitfield, 1999
28S | 55–62 | Reverse | 28SR | TAGTTCACCATCTTTCGGGTCCC | Mardulyn and Whitfield, 1999
Arginine kinase | 47–50 | Forward | F2 | GACAGCAARTCTCTGCTGAAGAA | Kawakita et al., 2003
Arginine kinase | 47–50 | Forward | intF | GTNTCNACYCGTGRAGATGYGG | This study
Arginine kinase | 47–50 | Reverse | R2 | GGTYTTGGCATCGTTGTGGTAGATAC | Kawakita et al., 2003
Arginine kinase | 47–50 | Reverse | intR | AGRGTRTCRRCRTCDCCRAAGTC | This study
COI | 50–53 | Forward | LCO1490 | GGTCAACAAATCATAAAGATATTGG | Folmer et al., 1994
COI | 50–53 | Reverse | HCO2198 | TAAACTTCAGGGTGACCAAAAAATCA | Folmer et al., 1994
EF1α | 47–52 | Forward | EF1A1F | AGATGGGYAARGGTTCCTTCAA | Belshaw and Quicke, 1997
EF1α | 47–52 | Reverse | EF1A1R | AACATGTTGTCDCCGTGCCATCC | Belshaw and Quicke, 1997
Rhodopsin | 47–55 | Forward | OpsFor2 | GGATGTASCTCCATTTGGTC | This study
Rhodopsin | 47–55 | Reverse | Ops3′2 | AVHGATGCRACRTTCATTTTCT | This study
Wingless | 47–53 | Forward | Wg1 | GARTGYAARTGYCAYGGYATGTCTGG | Brower and DeSalle, 1998
Wingless | 47–53 | Reverse | Wg2 | ACTICGCRCACCARTGGAATGTRCA | Brower and DeSalle, 1998
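
Several primers in Table 2 contain IUPAC ambiguity codes (R, Y, N and so on), so each one represents a pool of distinct oligonucleotides. The sketch below counts the size of that pool. The function name is ours, and the lookup table deliberately omits inosine (I), which occurs in one primer and is not a standard IUPAC ambiguity code, so such sequences raise an error rather than return a guess.

```python
# Number of nucleotides represented by each standard IUPAC code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(sequence):
    """Return how many distinct sequences a degenerate primer represents."""
    total = 1
    for base in sequence.upper():
        if base not in IUPAC_DEGENERACY:
            raise ValueError(f"unsupported symbol in primer: {base!r}")
        total *= IUPAC_DEGENERACY[base]
    return total


# Two primers copied from Table 2.
wg1 = "GARTGYAARTGYCAYGGYATGTCTGG"
lco1490 = "GGTCAACAAATCATAAAGATATTGG"

print("Wg1 degeneracy:", primer_degeneracy(wg1))
print("LCO1490 degeneracy:", primer_degeneracy(lco1490))
```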

Gene selection

The three genes, 16S, COI and 28S, originally used for this group were selected for their broad use among different groups of insects, ease of amplification across all taxa and because they provide resolution at several phylogenetic levels. 16S has been used to resolve intra-family relationships in Hymenoptera (Whitfield and Cameron, 1998). The nuclear gene 28S (including the D2 and D3 expansion loops) has provided a strong signal for intermediate and moderately deep levels in the phylogeny (Belshaw et al., 1998, Belshaw and Quicke, 1997, Cameron and Mardulyn, 2001, Cameron and Williams, 2003, Dowton and Austin, 1998, Mardulyn and Whitfield, 1999, Michel-Salzat and Whitfield, 2004, Whitfield, 2002, Whitfield et al., 2002, Wiegmann et al., 2003) and retains at least some signal at the species level. COI has been found to saturate quickly at the third position while remaining quite “conserved” at the first two positions due to a small number of sites free to vary (Mardulyn and Whitfield, 1999). Thus, it has proven highly useful at lower levels to detect species boundaries (Hebert et al., 2003a, Hebert et al., 2004, Hebert et al., 2003b) but has tended to fail at higher levels, especially in divergence time estimation studies (e.g., Whitfield, 2002). We added sequences from four nuclear genes to the dataset of Whitfield et al. (2002). Arginine kinase has been used to resolve bee relationships at species and tribal level (Danforth et al., 2005, Kawakita et al., 2003). The nuclear gene EF1α has been used extensively to resolve lepidopteran relationships at intermediate phylogenetic levels (Cho et al., 1995, Friedlander et al., 1998, Mitchell et al., 1997, Mitchell et al., 2000, Mitchell et al., 2006, Wiegmann et al., 2000) and has been used recently in a number of studies on Hymenoptera (Cameron, 2003, Danforth et al., 2004, Danforth and Ji, 1998, Kawakita et al., 2003, Leys et al., 2002, Michel-Salzat et al., 2004). 
The gene occurs in at least two divergent copies in most Hymenoptera (originally reported by Danforth and Ji, 1998), but these copies are easily separated by PCR once taxon-specific primers are developed. We used primers that amplify the F2 copy in bumble bees (Kawakita et al., 2003). Long wavelength rhodopsin (opsin) has been used to resolve relationships among bees (Mardulyn and Cameron, 1999), and is especially useful for intermediate levels of phylogeny (from species up to intergeneric and tribal levels—Cameron and Mardulyn, 2001, Cameron and Mardulyn, 2003, Cameron and Williams, 2003, Danforth et al., 2004, Kawakita et al., 2003, Michel-Salzat et al., 2004, Michel-Salzat and Whitfield, 2004), despite early reservations (Ascher et al., 2001). Wingless is less variable than many mtDNA genes, but more variable than most of the other nuclear protein-coding genes we sequenced in this study. Thus wingless tends to be useful at the generic level rather than at higher hierarchical levels (Brower and DeSalle, 1998).

Alignment

Sequences were aligned with Clustal X (Thompson et al., 1997). Alignment of COI, EF1α, and wingless sequences was straightforward as there were few insertions or deletions. The alignment of arginine kinase and opsin sequences was straightforward once an intron in each gene had been removed. There were several variable length-regions of 16S and 28S where it was difficult to assign homology. Regions of 16S and 28S that could not be aligned unambiguously were omitted from the analysis.
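
Omitting ambiguously aligned regions amounts to deleting a set of columns from every sequence in the alignment. The following is a minimal sketch of that step under our own assumptions: the function name, the toy sequences and the excluded coordinates are hypothetical, and the actual excluded regions of 16S and 28S are not given in the text.

```python
def exclude_regions(alignment, regions):
    """Drop alignment columns that fall in any (start, end) region.

    Regions are 0-based and half-open, i.e. columns start..end-1 are removed.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()

    dropped = set()
    for start, end in regions:
        dropped.update(range(start, end))

    kept = [i for i in range(n_sites) if i not in dropped]
    return {taxon: "".join(seq[i] for i in kept)
            for taxon, seq in alignment.items()}


# Hypothetical toy alignment with a length-variable stretch at columns 3-4.
toy = {"taxon_A": "ACGTTTGCA", "taxon_B": "ACG--TGCA"}
trimmed = exclude_regions(toy, [(3, 5)])
print(trimmed)
```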

Testing for incongruence

We tested for incongruence between genes using the incongruence length difference (ILD) test (Farris et al., 1994, Farris et al., 1995), implemented in PAUP*4.0b10 (Swofford, 2002) as the partition homogeneity test. We conducted 100 replicates, comparing genes both pairwise and individually against the rest of the combined sequence data (with that gene excluded). Parsimony-uninformative characters were removed before each test. We also tested the data for stationarity (equal nucleotide proportions between taxa) using the χ² test in PAUP*.
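
The two comparison schemes described above can be enumerated directly. With seven genes there are 21 gene pairs and seven gene-versus-rest comparisons, which matches the counts given in the Results. The snippet is only a bookkeeping sketch; it does not run the test itself, and the gene labels are ours.

```python
from itertools import combinations

genes = ["16S", "28S", "arginine kinase", "COI", "EF1a", "opsin", "wingless"]

# Scheme 1: every unordered pair of genes.
pairwise = list(combinations(genes, 2))

# Scheme 2: each gene against all remaining genes combined.
gene_vs_rest = [(gene, [g for g in genes if g != gene]) for gene in genes]

print(len(pairwise), "pairwise comparisons")
print(len(gene_vs_rest), "gene-versus-rest comparisons")
```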

Phylogeny estimation

We used PAUP* to conduct maximum parsimony (MP), maximum likelihood (ML) and LogDet phylogenetic analyses. LogDet is a distance-based method that is less affected than MP when taxa differ in their base frequencies (Lockhart et al., 1994). The Akaike Information Criterion as implemented in ModelTest 3.06 (Posada, 2000, Posada and Crandall, 1998) was used to select the model and estimate model parameters from all seven genes combined for the ML analysis. The selected model was GTR + gamma + proportion of invariable sites (Rodríguez et al., 1990, Tavaré, 1986, Yang et al., 1994), with base frequencies A = 0.3166, C = 0.1589, G = 0.1819; rate matrix AC = 1.7463, AG = 11.0123, AT = 8.9781, CG = 2.1939, CT = 14.8349; γ = 0.6963; and proportion of invariable sites = 0.3614. MrBayes 3.1 (Huelsenbeck and Ronquist, 2001, Ronquist and Huelsenbeck, 2003) was used to generate Bayesian estimates of microgastrine phylogeny. We used a mixed model approach with eight partitions corresponding to the morphological characters and the seven gene regions. The model used for each of the seven genes was GTR (Tavaré, 1986) plus a proportion of invariable sites plus gamma (Rodríguez et al., 1990, Yang et al., 1994). MrBayes estimated the model parameters from the data using one cold and three heated Markov chains. The Markov chain Monte Carlo run was 2,000,000 generations long, and we sampled the chain every 100 generations. We discarded the first 5000 samples as burn-in and thus estimated our phylogeny and posterior probabilities from a consensus of the last 15,000 sampled trees.
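
The sampling figures and base frequencies given above can be cross-checked with a few lines of arithmetic. This sketch assumes that the starting state of the chain is not counted as a sample and that the unreported frequency of T is simply the remainder after A, C and G; the variable names are ours.

```python
generations = 2_000_000
sample_every = 100
burn_in_samples = 5_000

# Samples drawn from the chain, and samples left after burn-in.
total_samples = generations // sample_every
retained_samples = total_samples - burn_in_samples

# Base frequencies reported for the ML model; T is implied (assumption).
base_freqs = {"A": 0.3166, "C": 0.1589, "G": 0.1819}
base_freqs["T"] = round(1.0 - sum(base_freqs.values()), 4)

print(total_samples, "samples in total;", retained_samples, "retained")
print("implied frequency of T:", base_freqs["T"])
```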

Assessing the effect of branch length on bootstrap support

To compare the branch lengths of branches with bootstrap support greater than 50% to branches with less than 50% support, we reduced our data set to the 27 taxa for which we had data for all seven genes and estimated the phylogeny for the 27 taxa under the MP criterion. The MP analysis found three most parsimonious trees. We then loaded one of the three most parsimonious trees found from the MP analysis into PAUP* as a constraint tree and used MP to estimate the branch length for the constraint tree for individual genes. Branches for the constraint tree were categorised as either bootstrap support >50% or <50% and branch lengths for the branches for each gene were recorded and compared using a Student’s t test in Systat 9 (SPSS, 1998).

Assessing the amount of data needed

Pseudoreplicate datasets one and a half, two, three, four, five and 10 times the size of our dataset were constructed from the aligned data by altering the number of characters resampled in the nonparametric bootstrap command of PAUP*. These pseudoreplicate datasets were then analysed under the MP criterion and bootstrapped to estimate the amount of data that would be required to estimate phylogenies with all nodes having bootstrap support greater than 50%. Pseudoreplicate datasets one and a half, two, three, four, five and 10 times the size of our dataset were also generated using a parametric approach with Seq-Gen (Rambaut and Grassly, 1997) from the ML equation calculated from the original data by ModelTest 3.06. Nonparametric bootstrap values were then obtained with PAUP* from the MP trees estimated from the datasets produced by Seq-Gen. This approach assumes that the data added will have similar properties to the data already obtained. This seems a valid assumption given that the genes we sequenced cover a range of evolutionary rates.
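
The nonparametric enlargement described above can be pictured as drawing alignment columns with replacement, but drawing more columns than the original alignment contains. The sketch below illustrates only that resampling step; it is not the PAUP* implementation, and the function name, toy alignment and seed are our own assumptions.

```python
import random


def resample_characters(alignment, factor, rng=None):
    """Draw round(factor * n_sites) columns with replacement.

    With factor = 1 this is an ordinary bootstrap pseudoreplicate; larger
    factors mimic a dataset that many times the size of the original.
    """
    rng = rng or random.Random()
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()

    n_draw = round(factor * n_sites)
    columns = [rng.randrange(n_sites) for _ in range(n_draw)]
    return {taxon: "".join(seq[i] for i in columns)
            for taxon, seq in alignment.items()}


# Hypothetical toy alignment of 10 characters.
toy = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTTCGTAA",
    "taxon_C": "ACCTACGAAC",
}
doubled = resample_characters(toy, factor=2, rng=random.Random(1))
print({taxon: len(seq) for taxon, seq in doubled.items()})
```

Each pseudoreplicate would then be analysed and bootstrapped in the usual way; only the character count changes.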

Assessing the effect of number of taxa

To assess the effect of altering the number of taxa, we randomly deleted taxa from our actual dataset to give datasets containing 15, 20, 25, 30, 35 and 40 taxa. We then obtained nonparametric bootstrap values for the branches from 100 replicates using MP.
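
Random taxon deletion of this kind is straightforward to sketch. The example below draws reduced taxon sets of the sizes listed above from 45 placeholder names (the full dataset contains 45 taxa). The function name, the placeholder labels and the seed are hypothetical, and the text does not say whether outgroups were protected from deletion, so none are here.

```python
import random


def subsample_taxa(taxa, n_keep, rng=None):
    """Return n_keep taxa chosen at random without replacement."""
    rng = rng or random.Random()
    taxa = list(taxa)
    if n_keep > len(taxa):
        raise ValueError("cannot keep more taxa than are available")
    return sorted(rng.sample(taxa, n_keep))


# Placeholder labels standing in for the 45 taxa in the dataset.
all_taxa = [f"taxon_{i:02d}" for i in range(1, 46)]

rng = random.Random(42)
reduced_sets = {n: subsample_taxa(all_taxa, n, rng)
                for n in (15, 20, 25, 30, 35, 40)}

for n, kept in reduced_sets.items():
    print(n, "taxa kept, e.g.", kept[:3])
```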

Results

We added 1248 nucleotides to the previously published dataset and analysed a total of 3031 nucleotides (including gaps). We used primers that bound to a more conserved region of COI and thus reduced the previously published COI sequences (Mardulyn and Whitfield, 1999) from 1235 nucleotides to 419 nucleotides that were homologous with our sequences. Levels of variation between species and genera differed among the seven genes, with 28S, arginine kinase, EF1α, opsin and wingless diverging more slowly than 16S and COI, which quickly saturated at the generic level (Fig. 1).
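
The distances summarised in Fig. 1 are uncorrected pairwise distances, that is, the proportion of compared sites at which two aligned sequences differ. A minimal sketch follows; the function name is ours, and skipping sites with gaps or missing data is our assumption rather than a documented detail of the original analysis.

```python
def p_distance(seq1, seq2, missing="-?N"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or missing-data symbol are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in missing or b in missing:
            continue
        compared += 1
        if a != b:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


print(p_distance("ACGTACGT", "ACGTTCGA"))  # two differences over eight sites
print(p_distance("AC-TA", "ACGTT"))        # one difference over four sites
```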
Fig. 1

Average uncorrected pairwise distances between microgastrine species and genera, and braconid subfamilies for the seven genes sequenced. Arginine kinase (Argk), elongation factor 1 alpha (EF1α), opsin and wingless are the genes added to the original dataset of Mardulyn and Whitfield, 1999, Whitfield, 2002.

The ILD test found 17 of the 21 pairwise comparisons of the genes were significantly incongruent (P ⩽ 0.01). The four comparisons that were not significantly incongruent were 28S to arginine kinase, EF1α and wingless, and EF1α to 16S. Six of the seven comparisons of individual genes to the rest of the combined molecular data (with each individual gene excluded) revealed significant differences (P = 0.01). The exception was arginine kinase, which was not significantly heterogeneous with the combined data (P = 0.6). The χ² test of sequence stationarity in PAUP* found there was significant heterogeneity in nucleotide proportions. A nonsignificant result was obtained when the third positions of codons in protein coding genes were excluded. A maximum parsimony analysis of all seven genes for all taxa found three equally parsimonious trees of length 6861; consistency index, excluding uninformative characters = 0.30; retention index = 0.44 from 3031 characters of which 1494 were constant and 1207 were parsimony informative. The strict consensus tree of the three most parsimonious trees is shown in Fig. 2.
Fig. 2

Strict consensus of the three most parsimonious trees (tree length 6861; consistency index, excluding uninformative characters, 0.30; retention index 0.44) obtained from a MP analysis of 3031 nucleotides from seven genes. Numbers above the branches are percentage bootstrap support values >50 (from 100 replicates).

An MP analysis found a single most parsimonious tree when third positions were excluded (result not shown) that was broadly congruent with the MP tree estimated from all positions (Fig. 2). Excluding third positions did not appreciably alter bootstrap support values obtained with MP, as 15 branches still had bootstrap support of less than 50%. Deleting third positions altered relationships found by MP within more recently diverged clades, but most deep and mid level relationships were not altered. The exception was that Choeras was placed as sister to Sendaphne and Promicrogaster, rather than with Fornicia and Deuterixys, when third positions were excluded, a more reasonable result based on morphology. Both MP and Bayesian methods (Fig. 2, Fig. 3) supported Microgastrinae as a monophyletic group. Both methods found broadly similar relationships. However, the placement of Fornicia differed greatly depending on the method of analysis. Maximum parsimony placed Fornicia as a sister taxon to Deuterixys rimulosa whereas the Bayesian analysis placed Fornicia in a clade with Hypomicrogaster and Parapanteles. We were unable to get sequences from three genes for Fornicia, and it is possible that missing data are causing the two methods to differ in their placement of Fornicia.
Fig. 3

Majority rule consensus tree from 15,000 trees estimated from seven genes using MrBayes 3.1. Broken lines represent branches with posterior probabilities of less than 0.9.

The numbers of branches with high or low levels of support differed slightly for the phylogenies estimated using MP and Bayes. Maximum parsimony had 14 branches with bootstrap support of less than 50%, whereas the Bayesian tree had 10 branches with posterior probability values of less than 0.9. There was no obvious trend for branches to have higher Bayesian posterior probabilities than MP bootstrap support values. Some branches were supported with posterior probabilities higher than 0.9 but had low bootstrap support, while some branches that had posterior probabilities much less than 0.9 received high bootstrap support values. For example, grouping Hypomicrogaster ecdytolophae and Parapanteles sp. as sisters received a bootstrap value of 79% but had a posterior probability of 0.56. It must of course be kept in mind that in these comparisons, both the branch support measure and the optimality criterion for tree estimation differ. The LogDet analysis found a single tree (result not shown) that was broadly congruent with both the MP and Bayesian trees. Nine nodes had bootstrap support of less than 50% (similar to the other methods). Nodes in the LogDet tree that differed from the MP and Bayesian trees did not have high levels of bootstrap support.

The effect of branch length on bootstrap support

Branches with bootstrap support of <50% in phylogenies obtained from the individual genes were significantly shorter than branches with bootstrap support >50% (mean for bootstrap <50% = 5.5, SE = 0.7; mean for bootstrap >50% = 10.5, SE = 0.7; Student’s t = 5.21, df = 127.3, P < 0.001) in the phylogeny (Fig. 4) estimated for the 27 taxa for which we had sequences for all seven genes.
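
Fractional degrees of freedom, such as the df = 127.3 reported above, arise when a two-sample t test does not assume equal variances (the Welch form with Satterthwaite's approximation). Whether Systat applied exactly this form is our assumption. The sketch below shows the calculation on made-up numbers; the function name and both samples are hypothetical and are not the branch lengths from this study.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Unequal-variance t statistic and Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b

    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical branch lengths (numbers of changes) for two support classes.
weakly_supported = [1, 2, 3, 4, 5]
well_supported = [2, 4, 6, 8, 10]

t_stat, dof = welch_t(weakly_supported, well_supported)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```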
Fig. 4

Maximum parsimony tree estimated for the taxa for which we have sequences for all seven genes. The horizontal length of each colour indicates the number of parsimony informative changes for each gene on the branch (blue = 16S, red = 28S, light-blue = arginine kinase, green = COI, grey = EF1α, yellow = opsin, black = wingless).


Simulations

The nonparametric approach using PAUP* to resample more characters from the original dataset found that the number of branches in the phylogeny with less than 50% bootstrap support fell reasonably quickly until the dataset contained approximately 12,000 nucleotides (Fig. 5). After 12,000 nucleotides the rate of reduction of poorly supported branches decreased, and even with ten times as much data as we obtained, five nodes still had bootstrap support of less than 50%. The parametric approach using Seq-Gen found that all branches of the phylogeny would have greater than 50% bootstrap support at around 12,000 nucleotides (Fig. 5). However, the nonparametric approach is more likely to be realistic, simulating normal “messy” data.
Fig. 5

The number of branches with less than 50% bootstrap support obtained by increasing the number of characters resampled in the bootstrap function of PAUP* (nonparametric approach) or by using Seq-Gen to generate data from the likelihood equation calculated from the original data (parametric approach).


Number of taxa

Reducing the number of taxa in the dataset had little effect on the proportion of branches in the phylogeny with less than 50% bootstrap support (Fig. 6). Between 50% and 71% of the branches had bootstrap support of less than 50%, depending on the number of taxa in the dataset.
Fig. 6

The proportion of branches with less than 50% bootstrap support for a phylogeny estimated using MP from the actual dataset of 45 taxa and with various numbers of taxa deleted randomly.


Addition of morphological characters

The Bayesian analysis of 53 morphological and 3031 molecular characters reduced the number of nodes with posterior probabilities of less than 0.9 from nine branches to three (Fig. 7). The phylogeny estimated from the Bayesian analysis of the seven genes alone differed from the phylogeny estimated from seven genes and 53 morphological characters in only two places. Fornicia moved from being part of a clade with Hypomicrogaster and Parapanteles (Fig. 3) in the phylogeny estimated from the genetic data alone, to being a sister to Venanides, Glyptapanteles and Cotesia (Fig. 7). Dolichogenidea and Pholetesor moved to being a sister to Hypomicrogaster, Promicrogaster, Parapanteles and Sendaphne in the combined genetic and morphological dataset. The placements of Fornicia and Pholetesor both had posterior probabilities of less than 0.9 in the phylogeny estimated from genes alone, and only the branch placing Fornicia improved markedly in support (from 0.63 to 0.98).
Fig. 7

Phylogeny estimated from seven genes and 53 morphological characters using Bayesian mixed models (see text for details). Broken lines represent nodes with Bayesian posterior probabilities of less than 0.9.


Discussion

We identified genes that diverge more slowly than those already sequenced for Microgastrinae and their addition resulted in a more robust phylogeny for Microgastrinae, as assessed by higher nonparametric bootstrap proportions and posterior probabilities. However, despite substantially increasing the size of the dataset, we still did not obtain a completely robustly supported phylogeny using several methods of phylogeny estimation. Indeed, our nonparametric bootstrap simulations suggest we are unlikely to get a completely supported phylogeny from DNA alone. Although bootstrap support is not a direct measure of phylogenetic accuracy, most authors at least implicitly interpret the figures as rough measures of statistical support for a node (Buckley and Cunningham, 2002).

Support for the branches in our phylogeny could not be increased markedly by the method of analysis alone. The LogDet method of analysis is less affected by nonstationary data than is MP (Lockhart et al., 1994). However, using the LogDet transformation also did not produce a totally robustly supported tree. Excluding character sets did not produce a totally supported tree either. For example, when an MP analysis of all data was compared to an MP analysis of the data with the third codon positions excluded, similar numbers of nodes had bootstrap support greater than 50%.

A marked improvement in support for our phylogeny was seen, however, when we added morphological characters and used a mixed model Bayesian analysis. Several other studies on diverse groups such as weevils (Marvaldi et al., 2002), molluscs (Collin, 2003) and feather mites (Dabert et al., 2001) have noted improvements in resolution and statistical support when analyses are conducted on combined morphological and DNA data. Dabert et al. (2001) also noted that molecular data alone tended to produce trees with better resolution and support at the terminal tips, and poor resolution and poor support deeper in the phylogeny, whereas the opposite (i.e., better resolution and support deeper in the phylogeny and poor resolution and support at the tips) occurred for phylogenies estimated from morphological data alone.

Short branches deep in a phylogeny are notoriously difficult to resolve. These branches will invariably have poor support as they are short due to a paucity of shared derived characters (Mardulyn and Whitfield, 1999), and we found that branches with less than 50% bootstrap support were significantly shorter than branches with greater than 50% support. In the case of the microgastrines, the short branches deeper in the phylogeny may be associated with the radiation of the Ditrysia (which contains 98% of present day lepidopteran species) that occurred from approximately 60 to 70 mya in the late Cretaceous or early Cenozoic (Grimaldi, 1999). This radiation coincides approximately with the origin of the microgastrine group calculated by Whitfield (2002).

The ideal gene to resolve short deep branches should have a fast rate of divergence at the time of the radiation, but the gene’s rate of divergence then needs to slow so that informative changes are not obscured by multiple substitutions at each site (Donoghue and Sanderson, 1992, Fishbein et al., 2001). It has been suggested that morphological characters may be more likely than nucleotide substitutions to undergo rapid changes followed by a slowing in the rate of change due to stabilising selection (Fishbein et al., 2001). Thus, using morphological characters may be a practical way to resolve short deep branches. Also, phenotypic variation is likely influenced by variation in many genes, and morphological characters may be a cost-effective way to indirectly increase the size of genetic datasets and improve levels of support for phylogenetic estimates.

Effect of methods

The increase in support for our phylogeny when morphological characters were added to the analysis was not due solely to changing from using bootstrap support of a MP analysis to using the posterior probabilities under a Bayesian approach. The Bayesian phylogeny estimated from the genetic data alone had lower levels of support than the Bayesian tree estimated from both genetic and morphological characters. It has been suggested that Bayesian posterior probabilities tend to be higher than MP bootstrap proportions for the same groups (Erixon et al., 2003). However, it has also been suggested that bootstrap values may be a conservative estimate of support for clades when support for clades is strong (Huelsenbeck and Rannala, 2004, Rannala and Yang, 1996), and it is likely that posterior probabilities give a better estimate of support than bootstrap proportions, especially when the most complex Bayesian models are used (Huelsenbeck and Rannala, 2004). We would have liked to compare bootstrap values of phylogenies generated by ML, rather than MP, to the posterior probabilities of the Bayesian trees, but this was impractical due to computational constraints.

The simulations using the bootstrap function in PAUP* to produce larger character sets from our data suggested that a phylogeny with every branch having more than 50% bootstrap support in a MP analysis was unlikely to be obtained without considerably more data. The approach using Seq-Gen was more optimistic, suggesting complete support was possible with around 12,000 nucleotides. However, the parametric method of simulating datasets produces data without gaps or missing data and thus produces a “perfect” dataset that is almost certainly unobtainable in reality. For example, we estimated a phylogeny with only one branch with less than 50% bootstrap support from a dataset simulated with Seq-Gen of the same size as our actual dataset. This compares to 14 branches with less than 50% bootstrap support in the phylogeny estimated from the actual data. The PAUP* based approach produces datasets that are more realistic, as the simulated datasets have gaps in the sequences and missing data, and is therefore more likely to give a better estimate of the data required to estimate a totally bootstrap supported phylogeny.
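
The idea behind the bootstrap-based enlargement can be illustrated with a minimal Python sketch. It is not the PAUP* procedure used in the study. It only shows the general technique of resampling alignment columns with replacement to build a pseudo-dataset larger than the original. The function name `bootstrap_columns`, the dictionary representation of the alignment and the toy sequences are all assumptions made for illustration.

```python
import random


def bootstrap_columns(alignment, n_sites, seed=None):
    """Resample alignment columns with replacement.

    alignment: dict mapping taxon name -> aligned sequence (hypothetical format).
    n_sites:   number of columns in the pseudo-dataset; may exceed the original length.
    """
    rng = random.Random(seed)
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    n_original = lengths.pop()
    # Draw column indices with replacement from the original alignment.
    picks = [rng.randrange(n_original) for _ in range(n_sites)]
    # Whole columns are copied, so gaps ('-') and missing data ('?') are carried over.
    return {taxon: "".join(seq[i] for i in picks) for taxon, seq in alignment.items()}


# Toy alignment (invented for illustration only, not data from the study).
toy = {"taxon_A": "ACGT-ACG", "taxon_B": "ACGTTAC?", "taxon_C": "AC-TTACG"}
enlarged = bootstrap_columns(toy, n_sites=32, seed=1)
```

Because whole columns are copied, any gaps or missing entries in the original alignment reappear in the enlarged dataset. A model-based simulator, by contrast, generates a complete character for every taxon at every site.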

Incongruent phylogenies

ILD test

The P-values of <0.05 obtained for many of the ILD tests do not necessarily mean that conflict between individual genes has reduced bootstrap support for nodes in our phylogeny. There is significant disagreement as to what level of significance should be used to reject partition homogeneity. For example, Cunningham (1997) suggested a critical value of less than 0.01 should be used. There is also controversy over whether significant heterogeneity should preclude combining data derived from different genes for a phylogenetic analysis. Yoder et al. (2001) examined the effect of changing character weighting and/or data combinations on the phylogenetics of slow lorises and found that correct results were poorly supported and even incorrect results were obtained when character weighting and/or data combinations were altered to reduce incongruence (as assessed by the ILD). Likewise, Sullivan (1996) found that combining two heterogeneous datasets produced a better estimate of deer mouse and grasshopper mouse phylogenies than did either gene alone. Barker and Lutzoni (2002) also found from simulations that the ILD test was a relatively poor predictor of the effect of combining datasets on phylogenetic accuracy. Dolphin et al. (2000) found that when rate differences between the two matrices being assessed reach a certain level, the ILD test could suggest significant heterogeneity despite the two matrices having similar underlying topologies. It seems likely that the marked differences in the divergence rates of the genes we analysed have generated the significant ILD test results. We suggest that we reduced the adverse effects of data heterogeneity by using complex evolutionary models for each partition of our data in a Bayesian analysis.

Wrong/Lack of data

Inappropriate gene choice has been suggested as a reason why it has been difficult to obtain a robust microgastrine phylogeny (Mardulyn and Whitfield, 1999). Several of the genes we used have been used in other studies of braconid phylogenetics (for example, Belshaw et al., 1998, Mardulyn and Whitfield, 1999, Michel-Salzat and Whitfield, 2004, Min et al., 2005, Whitfield et al., 2002), and in other studies of hymenopteran phylogenetics (Cameron and Mardulyn, 2001, Danforth et al., 2004, Kawakita et al., 2003, Sanchis et al., 2001, Weiblen, 2004). Given that our choice of genes covered deep, medium and shallow divergences and that these genes have been used successfully to estimate robust phylogenies for an enormous variety of taxa, we do not conclude that inappropriate gene choice has caused the poor support for some of the nodes. The contribution to the phylogenies from all the COI data is likely subject to long branch attraction (Felsenstein, 1978, Hendy and Penny, 1989) as this gene has the highest uncorrected pairwise divergences of the seven genes between microgastrine species and yet it has only the fifth highest levels of divergence between the braconid subfamilies. However, using model-based methods for estimating the phylogenies has probably lessened the effect of long branch attraction. Missing data are also unlikely to have resulted in poor bootstrap support. A phylogeny estimated from those taxa for which we obtained sequences for all seven genes had six branches (of 51 total) with bootstrap support <50% in a MP analysis. The poorly supported branches were significantly shorter than branches with >50% bootstrap support. An examination of the contribution of individual genes to the branch length of the poorly supported branches found that there were few changes in all seven genes for these short branches, suggesting that a rapid radiation (i.e., truly short branches) has indeed occurred. 
Insufficient data have also been suggested as a reason for poorly supported phylogenies. Rokas et al. (2003) suggested that 20 genes would be required to obtain mean bootstrap values of 95% with a 95% confidence interval for seven species of Saccharomyces yeasts. However, as few as eight genes would give mean bootstrap values of 95% with a 95% confidence interval if nonstationary genes (genes that have markedly shifted nucleotide frequencies for some taxa) were excluded from the Saccharomyces analysis (Collins et al., 2005). Deleting the third positions of codons from our data resulted in the genes becoming stationary. However, while deletion of third positions did not markedly alter the relationships estimated, it also did not markedly increase bootstrap support for our MP trees.
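
The uncorrected pairwise divergence used above to compare COI with the other genes is simply the proportion of compared sites at which two aligned sequences differ. The Python sketch below shows that calculation. It is not the authors' procedure: the function name `p_distance` is hypothetical, and the decision to skip sites with a gap or an ambiguous character (`-`, `?`, `N`) in either sequence is an assumption, since the text does not say how such sites were handled.

```python
def p_distance(seq1, seq2, ignore="-?N"):
    """Uncorrected pairwise divergence between two aligned sequences.

    Sites where either sequence has a character listed in `ignore` are skipped;
    this gap-handling rule is an assumption, not taken from the study.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in ignore or b in ignore:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared
```

Because no substitution model is applied, this distance underestimates the true amount of change once multiple substitutions accumulate at the same site. That is the saturation concern raised for fast-evolving genes such as COI.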

Effect of increasing taxa

There has been debate over whether it is better to add taxa or characters to an analysis to improve accuracy, given that resources are always finite. Rosenberg and Kumar (2001) suggested that longer sequences, rather than more taxa, will improve the accuracy of the phylogeny estimated. However, it has also been argued that increasing the number of taxa equally reduces error in phylogenetic estimations (Pollock et al., 2002). The improvement in phylogenetic accuracy is in part determined by the length of sequences already obtained and by the levels of divergence between the taxa (Hillis et al., 2003). For example, if there are several long branches in the phylogeny, effort may be better expended on sequencing taxa that break up the long branches rather than adding more characters (Graybeal, 1998). In the microgastrine case, our simulations showed that adding neither taxa nor genetic data would increase bootstrap support for the short branches in the phylogeny.

Phylogenetic results

We found strong support (100% of bootstrap replicates from the MP analysis of the seven genes, and posterior probabilities of 0.99 for both the molecular data and the molecular data plus morphology) for monophyly of the microgastroid complex, sensu Wharton, 1993, Whitfield and Mason, 1994, Whitfield, 1997a, Dowton and Austin, 1998. Our finding of monophyly for Microgastrinae agrees with an earlier analysis of 16S data that also found the microgastroid complex to be monophyletic, although with equivocal bootstrap support (Dowton et al., 1998, Whitfield, 1997b). An analysis of portions of 16S and 28S rDNA also found strong bootstrap support for monophyly of the microgastroids (Dowton and Austin, 1998). Our Bayesian analysis of the combined molecular and morphological characters found a relationship for the braconid subfamilies of (Cheloninae, (Mendesellinae, (Microgastrinae, (Cardiochilinae, Miracinae)))). Belshaw et al. (1998) found a similar relationship for the microgastrine subfamilies, excluding Mendesellinae, from an analysis of a portion of the 28S region. These results, however, conflict with a MP analysis of portions of the 16S and 28S genes and 11 morphological characters (Dowton and Austin, 1998), and a Bayesian analysis of portions of the 16S, 18S and 28S regions and 96 morphological characters (Min et al., 2005), both of which found a relationship of (Adelinae + Cheloninae, (Miracinae, (Microgastrinae, Cardiochilinae))). We intend a more extensive examination of subfamily relationships within the microgastroid complex in the near future. It is difficult to compare our estimate of relationships within Microgastrinae to other studies, as generally different or fewer microgastrine species were sampled in those studies (e.g., Belshaw et al., 1998, Dowton et al., 1998). We found that Microplitis and Snellenius represent an early diverging lineage of microgastrines. A phylogeny estimated from portions of 16S and 28S also found Microplitis to be basal (Dowton and Austin, 1998), as did a phylogeny estimated from a portion of 28S (Mardulyn and Whitfield, 1999).
  60 in total

1.  Phylogenetic signal in the COI, 16S, and 28S genes for inferring relationships among genera of Microgastrinae (Hymenoptera; Braconidae): evidence of a high diversification rate in this group of parasitoids.

Authors:  P Mardulyn; J B Whitfield
Journal:  Mol Phylogenet Evol       Date:  1999-08       Impact factor: 4.286

2.  Nuclear genes resolve mesozoic-aged divergences in the insect order Lepidoptera.

Authors:  B M Wiegmann; C Mitter; J C Regier; T P Friedlander; D M Wagner; E S Nielsen
Journal:  Mol Phylogenet Evol       Date:  2000-05       Impact factor: 4.286

3.  Timing the ancestor of the HIV-1 pandemic strains.

Authors:  B Korber; M Muldoon; J Theiler; F Gao; R Gupta; A Lapedes; B H Hahn; S Wolinsky; T Bhattacharya
Journal:  Science       Date:  2000-06-09       Impact factor: 47.728

4.  Incomplete taxon sampling is not a problem for phylogenetic inference.

Authors:  M S Rosenberg; S Kumar
Journal:  Proc Natl Acad Sci U S A       Date:  2001-08-28       Impact factor: 11.205

5.  Genome-scale approaches to resolving incongruence in molecular phylogenies.

Authors:  Antonis Rokas; Barry L Williams; Nicole King; Sean B Carroll
Journal:  Nature       Date:  2003-10-23       Impact factor: 49.962

6.  Phylogeny of Saxifragales (angiosperms, eudicots): analysis of a rapid, ancient radiation.

Authors:  M Fishbein; C Hibsch-Jetter; D E Soltis; L Hufford
Journal:  Syst Biol       Date:  2001 Nov-Dec       Impact factor: 15.683

7.  Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species.

Authors:  Paul D N Hebert; Sujeevan Ratnasingham; Jeremy R deWaard
Journal:  Proc Biol Sci       Date:  2003-08-07       Impact factor: 5.349

8.  The utility of the incongruence length difference test.

Authors:  F Keith Barker; François M Lutzoni
Journal:  Syst Biol       Date:  2002-08       Impact factor: 15.683

9.  Increased taxon sampling is advantageous for phylogenetic inference.

Authors:  David D Pollock; Derrick J Zwickl; Jimmy A McGuire; David M Hillis
Journal:  Syst Biol       Date:  2002-08       Impact factor: 15.683

10.  Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models.

Authors:  John Huelsenbeck; Bruce Rannala
Journal:  Syst Biol       Date:  2004-12       Impact factor: 15.683

