
Use of microsatellite and SNP markers for biotype characterization in Hessian fly.

Brandon J Schemerhorn¹, Yan Ma Crane², Sue E Cambron², Charles F Crane³, Richard H Shukle².

Abstract

Exploration of the biotype structure of Hessian fly, Mayetiola destructor (Say) (Diptera: Cecidomyiidae), would improve our knowledge regarding variation in virulence phenotypes and difference in genetic background. Microsatellites (simple sequence repeats) and single-nucleotide polymorphisms (SNPs) are highly variable genetic markers that are widely used in population genetic studies. This study developed and tested a panel of 18 microsatellite and 22 SNP markers to investigate the genetic structure of nine Hessian fly biotypes: B, C, D, E, GP, L, O, vH9, and vH13. The simple sequence repeats were more polymorphic than the SNP markers, and their neighbor-joining trees differed in consequence. Microsatellites suggested a simple geographic association of related biotypes that did not progressively gain virulence with increasing genetic distance from a founder type. Use of the k-means clustering algorithm in the STRUCTURE program shows that the nine biotypes comprise six to eight populations that are related to geography or history within laboratory cultures. Published by Oxford University Press on behalf of the Entomological Society of America 2015. This work is written by US Government employees and is in the public domain in the US.


Keywords: Hessian fly; biotype; microsatellite; single-nucleotide polymorphism


Substances:
Genetic Markers
DNA

Year: 2015 PMID: 26543089 PMCID: PMC4633977 DOI: 10.1093/jisesa/iev138

Source DB: PubMed Journal: J Insect Sci ISSN: 1536-2442 Impact factor: 1.857

The Hessian fly, Mayetiola destructor (Say), which supposedly entered the United States in the straw bedding of Hessian mercenaries during the American Revolution, was first reported in the United States from Long Island, New York, in 1779 (Fitch 1847, Ratcliffe et al. 2000). Although current cultural practices, particularly late planting and the use of resistant varieties, have limited its toll, Hessian fly remains a primary pest of wheat throughout most of the United States, and it is the most destructive insect pest on cereals worldwide (Prescott et al. 1986, Smiley et al. 2004). Walsh (1864) was the first to describe morphologically identical insects, which differ heritably in host preference, as “phytophagic varieties.” The term “biotype” has been applied primarily to distinguish populations that differ only in a particular trait, such as survival and development on a particular host, or host preference for feeding, oviposition, or both, like most of Walsh’s “phytophagic varieties.” The named Hessian fly biotypes began either from laboratory selection experiments (Gallun et al. 1961) or as samples from natural populations (Cartwright and Nobel 1947). Sixteen Hessian fly biotypes, classified as “GP” (named for the Great Plains region of the central United States) and “A” through “O,” have been identified on the basis of their differential, gene-for-gene response to four resistance genes in wheat (Gallun 1977). The resistance genes are H3 (carried in cultivar “Monon”), H5 (carried in cultivar “Magnum”), H6 (carried in cultivar “Caldwell”), and H7H8 (carried in cultivar “Seneca”) (Ratcliffe and Hatchett 1997). However, additional biotypes have been found on the basis of virulence to other resistance genes; thus biotypes vH9 and vH13 respectively are virulent against resistance genes H9 and H13 (Formusoh et al. 1996, Zantoko and Shukle 1997). At least 30 Hessian-fly resistance genes have been described in wheat and related species in Aegilops and Secale (Li et al. 2015), although the number of distinct genes is probably less than that because the reporting studies use non-overlapping sets of resistant and susceptible wheat varieties to analyze segregation of the resistance in usually a single resistant variety. Since the studies are not cross-referenced, duplicate reports of the same gene are likely, and also different alleles of the same locus can be named as independent H genes. This particularly applies to an 11 centimorgan region on wheat chromosome 1AS that harbors several of the best-mapped H genes (Li et al. 2015). The potential number of Hessian fly biotypes based on all of these additional genes is very large, 2ⁿ for n distinct H genes. In Hessian fly, the allele for avirulence is dominant to the allele(s) for virulence at the same locus (Bouhssini et al. 2001). The Great Plains Biotype is not virulent against any of the listed resistance genes. Therefore, it is considered to have been predominant in the Hessian fly field population prior to the widespread cultivation of resistant wheat varieties and subsequent natural selection for virulence to the resistance genes in these varieties. Biotype “L” is the most virulent and can attack wheat varieties with any of these listed genes (Ratcliffe et al. 1994, 1997, 2000, 2002). All these biotypes are morphologically identical as larvae and as adults, which is typical of host-specific plant parasites and insect parasitoids (Diehl and Bush 1984). Each resistance gene usually confers resistance to one or a few biotypes of Hessian fly. As virulence to the currently deployed resistance genes evolves in Hessian fly, wheat breeders search for additional resistance genes to incorporate into elite wheat varieties (Kudagamage et al. 1990, Ratcliffe et al. 1994). Our laboratory maintains stocks of nine of the named Hessian fly biotypes for research into the basis of virulence and host resistance.
We also want to understand their inter-relatedness, because this provides insight into how virulence evolves to newly deployed H genes in wheat, and such understanding might help to increase the durability of H genes deployed in the future. In particular, does increased virulence evolve through an orderly progression of incrementally more virulent stages, or does it arise sporadically and suddenly by single mutations without regard to intensity? Or does increased virulence evolve in some intermediate way? Microsatellites (simple sequence repeat [SSRs]) and single-nucleotide polymorphisms (SNPs) are popular, easily used molecular markers (Vignal et al. 2002) that are routinely used in crop breeding, recombinational and physical genetic mapping, association mapping, and analysis of quantitative trait loci and thus have helped to increase yield and decrease pest damage (Al-Maskri et al. 2012). We have developed and applied a set of 18 Hessian fly SSR markers for the estimation of genetic variability, population structure, and gene flow among geographically diverse Hessian fly populations (Schemerhorn et al. 2008, 2009, 2014). Here, we have used the same 18 SSR markers, plus 22 SNP markers, to investigate the genetic relationship of 9 of the 16 named Hessian fly biotypes. Eleven of the SNP markers are introduced in this study, while the other 11 have been used in our laboratory to investigate the geographic diversity of Hessian fly in North America (Schemerhorn et al. 2014). We analyzed allele and genotype frequencies to better understand how the biotypes evolved and how they have responded to long-term maintenance in our laboratory setting.

Materials and Methods

Sample Collection and DNA Extraction

Hessian fly [Mayetiola destructor (Say)] biotypes B, C, D, E, GP, L, O, vH9, and vH13 were sampled from our laboratory cultures (Table 2). For each biotype, 48 female flies were collected and stored individually in 100% ethanol before extraction of DNA as described by Schemerhorn et al. (2008, 2009).

Table 2.

Characteristics and sources of the nine studied biotypes

Biotype	Origin or prevalent region	Characteristics
B	Prevalent in parts of the eastern soft wheat region, particularly in the Middle West.	Can attack wheats that have no genes for resistance or that have resistances similar to those of H7H8 and H3. Cannot attack wheats having resistance similar to that of H6 or H5.
C	Selected in a greenhouse experiment from a population of biotype A that prevails in the eastern United States.	Can attack wheats that have no genes for resistance or that have resistances similar to those of H7H8 and H6. It cannot attack wheats that have resistances similar to those of H3 or H5. It can live on H9.
D	Collected from a field in Indiana in the 1970s.	This virulent biotype can attack almost any wheat that lacks H5.
E	Found in a wheat field in Georgia	Avirulent on H5, H6, and H7H8.
GP	Maybe the original biotype that entered the eastern United States 200 years ago. Collected in Kansas.	This least virulent of the biotypes can attack only wheats having no genes for resistance, such as the variety Turkey.
L	Southern Indiana field	Attacks wheats with H3, H6, H5, and H7H8; avirulent on H26
O	Southern Georgia field	Attacks wheats with H3, H6, H5, but not H7H8
vH9	Selected out of field population from south Georgia	Virulent on H3, H5, H6, H7H8, H9, but not H13
vH13	Selected from bulk biotype E	Virulent on H3 and H13, avirulent on H5 and H9, segregating for virulence on H6, H7H8, and H26
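
The gene-for-gene classification behind Table 2 can be sketched as a set of virulent/avirulent flags over the four tester genes H3, H5, H6, and H7H8. The following is a minimal illustration only: the names `TESTER_GENES`, `TABLE2_VIRULENCE`, `all_patterns`, and `pattern_code` are hypothetical, and only biotypes whose reaction to all four tester genes is stated in Table 2 are encoded (the entry for D reads "almost any wheat that lacks H5" as virulence on H3, H6, and H7H8, which is an assumption).

```python
from itertools import product

# Tester resistance genes named in the text.
TESTER_GENES = ("H3", "H5", "H6", "H7H8")

# Virulence sets transcribed from Table 2. Biotype D is an interpretation of
# "can attack almost any wheat that lacks H5" (assumption, see lead-in).
TABLE2_VIRULENCE = {
    "GP": set(),
    "B": {"H3", "H7H8"},
    "C": {"H6", "H7H8"},
    "D": {"H3", "H6", "H7H8"},
    "O": {"H3", "H5", "H6"},
    "L": {"H3", "H5", "H6", "H7H8"},
}


def all_patterns(genes):
    """Return every virulent/avirulent combination over the given genes."""
    return [
        {g for g, virulent in zip(genes, flags) if virulent}
        for flags in product((False, True), repeat=len(genes))
    ]


def pattern_code(virulent_on, genes=TESTER_GENES):
    """Encode a virulence set as 'V'/'A' flags in tester-gene order."""
    return "-".join("V" if g in virulent_on else "A" for g in genes)


patterns = all_patterns(TESTER_GENES)
print(len(patterns))  # number of possible patterns, 2**4
for name, virulent_on in TABLE2_VIRULENCE.items():
    print(name, pattern_code(virulent_on))
```

Each added tester gene doubles the number of possible patterns, which is why the count of potential biotypes grows so quickly as more H genes are considered.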

Microsatellite and SNP Data Collection

Forty molecular markers were tested for each of the 432 DNA samples representing nine biotypes and 48 female flies per biotype. There were 18 fluorescently labeled SSR primer pairs, Hf14, Hf32, Hf55, Hf56, Hf71, Hf101, Hf102, Hf103, Hf104, Hf108, Hf109, Hf112, Hf113, Hf114, Hf116, Hf124, Hf164, and Hf174. There were 11 SNPs, HfSNPa20, HfSNPa26, HfSNPa32, HfSNPa38, HfSNPa44, HfSNPa50, HfSNPa56, HfSNPa63, HfSNPa70, HfSNPa76, and HfSNPa81, that we have developed and used previously (Schemerhorn et al. 2008, 2014). Eleven new SNP markers, HfSNPs21, HfSNPs26, HfSNPs31, HfSNPs39, HfSNPs45, HfSNPs50, HfSNPs57, HfSNPs62, HfSNPs69, HfSNPs74, and HfSNPs80, were developed for this study from BAC sequences that had been mapped to the sex chromosome by FISH (Schemerhorn et al. 2009). The primer development and polymerase chain reaction conditions followed Schemerhorn et al. (2014) and Rozen and Skaletsky (2000). Each single-base-extension interrogation primer was chosen to end immediately 5’ to a SNP within a previously amplified product (Table 1). The polymerase chain reaction provided enough template to allow fluorescent detection of the added base by single-base extension. The interior primers varied from 20 to 81 bp in length with annealing temperature from 69 to 79°C, to allow multiplexed detection (Jiang et al. 2002, Sanchez and Endicott 2006, Schemerhorn et al. 2014). After the primer extension reaction, each set of 11 multiplexed SNP extension products, together with the GenomeLab DNA size-standard kit 80, was subjected to capillary electrophoresis on a CEQ8000 Genetic Analysis System (Beckman Coulter, Brea, CA, Fig. 1), using its SNP detection program. All SSR markers were likewise detected by a single operator on this CEQ8000 Genetic Analysis System.

Table 1.

Characteristics of 11 sex-chromosomal SNP loci used in 9 Hessian fly biotypes

Name	GenBank	PCR primers	T_m (°C)	SNP interrogation primer	T_m (°C)	AS (nt)
HfSNPs21	KJ917318	F: AACAGAGAAAAGATGACCCAAATC	60	R: ACGGAGGATGGCATGTGGCAA	70	22
HfSNPs21	KJ917318	R: ATGAGATAATCAGTTAAATCACGACCAG	62		70	22
HfSNPs26	KJ917329	F: TGGCTGCTGGTATTCTAACCAAAAACATCG	71	R: GCCAAGTAACAATTGACCCAAACCGA	69	26
HfSNPs26	KJ917329	R: TTCTCGGCCCAATCCAGACTGCTTGTA	73			26
HfSNPs31	KJ917329	F: TGGCTGCTGGTATTCTAACCAAAAACATCG	71	F: ATAATGCCTTGGTATTTTCGAATGCTGTTGA	70	31
HfSNPs31	KJ917329	R: TTCTCGGCCCAATCCAGACTGCTTGTA	73			31
HfSNPs39	KJ917322	F: GCACAGGCAGTACCCACGACTAGGTCA	72	R: (A)₁₀GTATCGGCCCACAACAACACTCGATGAAG	76	37
HfSNPs39	KJ917322	R: GTATCGGCCCACAACAACACTCGATGA	72		76	37
HfSNPs45	KJ917310	F: ATTTTGATGGTTGATGGGGCCAATGAG	72	F: (A)₁₄TGGTACCACCAGACATAACGATGTTAGCGTA	75	45
HfSNPs45	KJ917310	R: GGTAATGAACGTTTCCGTTGCCCAGAA	72			45
HfSNPs50	KJ917318	F: CGCACTTGACTTTGAACAAGAAATGGCTACC	72	F: (A)₁₉ GAAATGCGATGTTGATATCCGTAAGGACTTG	76	50
HfSNPs50	KJ917318	R: TCCTTTTGCATACGATCAGCAATACCTGGA	72			50
HfSNPs57	KJ917318	F: CGCACTTGACTTTGAACAAGAAATGGCTACC	72	R: (A)₂₁TTTCATGATTGAATTGTAGACAGTTTCGTGAATACC	76	56
HfSNPs57	KJ917318	R: CCTTTTGCATACGATCAGCAATACCTGGA	72		76	56
HfSNPs62	KJ917310	F: ATTTTGATGGTTGATGGGGCCAATGAG	72	R: (A)₂₉CATGTATCCAGGTATTGCTGATCGTATGCAAAA	78	63
HfSNPs62	KJ917310	R: GGTAATGAACGTTTCCGTTGCCCAGAA	72			63
HfSNPs69	KJ917310	F: ATTTTGATGGTTGATGGGGCCAATGAG	72	R: (A)₃₃ATTGTTCCAACCATCATTCTTGGGTATGGAATCTTG	79	70
HfSNPs69	KJ917310	R: GGTAATGAACGTTTCCGTTGCCCAGAA	72			70
HfSNPs74	KJ917321	F: ATTATGGGTCCTAGAAGTCGAAATGAAGT	65	F: (A)₅₂ATGTGACGACATTGAAAGAACT	77	73
HfSNPs74	KJ917321	R: TTTTACGAGGAAATTCAACTTCAAGTGTTT	65		77	73
HfSNPs80	KJ917330	F: TGGACTATCTAATTGTGAAAGGTAAAAA	60	R: (A)₄₄TGTGGTCTTAATTATATTCGAAGGAATTGTAAGTTT	76	83
HfSNPs80	KJ917330	R: GCCAGTAAAGAGTTTAATTCCCAAG	61		76	83

PCR, polymerase chain reaction; AS, apparent size.

Fig. 1.

Product sizes of 11 multiplexed SNP loci. The polymerase chain reaction (PCR) products were mixed for capillary electrophoresis with red size standard (13 nt and 80 nt). HfSNPs69 is heterozygous.


Data Analysis

In total, 17,280 lengths or terminal bases were scored from 18 SSR and 22 SNP markers, with 48 flies in each of nine biotypes. All the microsatellite alleles were scored with the aid of a Perl script that binned fragment lengths and mapped each fragment length to the nearest sufficiently populated bin (Schemerhorn et al. 2014), and the data were checked with Micro-Checker (van Oosterhout et al. 2004) for evidence of null alleles and scoring errors. The program STRUCTURE 2.3.3 was used to estimate the co-ancestry among different biotypes. The most probable number of distinct populations, K, was estimated from simulations in STRUCTURE HARVESTER (Earl 2009). For each of 20 runs per trial value of K from 1 to 9, there were 100,000 burn-in iterations preceding 75,000 Markov chain Monte Carlo replications. The optimum K was determined by evaluating the likelihood of the posterior probability (Pritchard et al.) through ΔK, an ad hoc quantity related to the change in posterior probabilities between runs of consecutively different K values (Evanno et al.). The SSRs and SNP data were analyzed separately in Arlequin 3.1 (Excoffier et al. 2005, Excoffier and Lischer 2010) for linkage disequilibrium, Hardy–Weinberg equilibrium, and pairwise population FST values. Phylogenetic trees were constructed by neighbor joining in Poptree2 (Takezaki et al. 2010) for FST values obtained separately from SSR and SNP data. An additional tree was produced with MEGA5 (Tamura et al. 2011) for Dest values obtained with SMOGD (Crawford 2010) from the combined SSR and SNP data.
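
The ΔK statistic is commonly computed as the absolute second-order difference of the mean log likelihood, |L(K+1) − 2L(K) + L(K−1)|, divided by the standard deviation of L(K) across replicate runs. The sketch below assumes that form; it is not taken from this study's scripts. The function name `delta_k` and the input values are hypothetical and are not data from this work.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute an Evanno-style delta K from replicate ln P(D|K) values.

    lnp_by_k maps each trial K to a list of ln P(D|K) values, one per run.
    Delta K is defined only where both K-1 and K+1 were also tried.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    sds = {k: stdev(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means and sds[k] > 0:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / sds[k]
    return result


# Hypothetical ln P(D|K) values, three runs per K (illustration only).
example = {
    1: [-1000.0, -1001.0, -999.0],
    2: [-900.0, -902.0, -898.0],
    3: [-700.0, -701.0, -699.0],
    4: [-690.0, -692.0, -688.0],
}
dk = delta_k(example)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because the statistic needs both neighbors of each K, trial values of K from 1 to 9 allow ΔK to be evaluated only for K = 2 through 8.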

Results

The 18 SSR and 22 SNP loci generated 17,280 unique trace profiles from 432 individual flies. The 18 polymorphic markers were examined for frequency within the biotypes and usefulness to assess relationships among the biotypes. Since all the biotypes are currently being maintained as laboratory populations, the analysis reflects not only phylogeny of the source populations from which some of the biotypes were obtained but also selection for virulence to certain resistance genes in wheat and the success of the isolation of the biotypes in long-term cultures. Some of the tested loci varied widely in frequency among the biotypes, which indicates the independence of the biotypes as breeding populations. A case in point is the single SNP locus HfSNPs69, whose four alleles summed to the frequency of 1 (Supp Table 1 [online only]). Among the nine biotypes, the frequency of A varied from 0.281 to 0.969, C varied from 0.000 to 0.094, G varied from 0.000 to 0.719, and T varied from 0.000 to 0.245. Clearly, there are distinct populations within the collection of biotypes. Figure 2 is a Poptree2 dendrogram derived by neighbor joining from FST values based on the SSR loci; Fig. 3 was obtained the same way from the 21 polymorphic SNP loci. Both of these dendrograms were generated with resampling, and bootstrap support out of 100 runs is shown for each nonbasal node. A third dendrogram (Fig. 4) was produced using SMOGD and MEGA5 for the combined data using Dest instead of FST, and bootstrapping could not be performed for it.
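
How strongly such frequency differences translate into a differentiation index can be illustrated with a simple heterozygosity-based FST for two populations, (H_T − H_S)/H_T. This is a simplified stand-in and not necessarily the estimator that Arlequin applies; the function names are hypothetical. The two frequency vectors are hypothetical as well: they borrow the extreme values of allele A reported above and assign the remainder to G as an assumption.

```python
def expected_het(freqs):
    """Uncorrected expected heterozygosity, 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def fst_two_pops(freqs1, freqs2):
    """Heterozygosity-based FST for two equally weighted populations."""
    alleles = set(freqs1) | set(freqs2)
    pooled = {
        a: (freqs1.get(a, 0.0) + freqs2.get(a, 0.0)) / 2 for a in alleles
    }
    h_t = expected_het(pooled)  # total expected heterozygosity
    if h_t == 0:
        return 0.0  # both populations fixed for the same allele
    h_s = (expected_het(freqs1) + expected_het(freqs2)) / 2  # within-population mean
    return (h_t - h_s) / h_t


# Hypothetical two-allele frequency vectors (assumption, see lead-in).
pop_high_a = {"A": 0.969, "G": 0.031}
pop_low_a = {"A": 0.281, "G": 0.719}
fst_example = fst_two_pops(pop_high_a, pop_low_a)
print(round(fst_example, 3))
```

Identical frequency vectors give 0 and populations fixed for different alleles give 1, so the index summarizes where a pair of biotypes falls between those extremes at one locus.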

Fig. 2.

Neighbor-joining tree for SSR FST values by Poptree2 with 100 bootstrap runs.

Fig. 3.

Neighbor-joining tree from SNP FST data by Poptree2 with 100 bootstrap runs.

Fig. 4.

Neighbor-joining tree by MEGA5 from DEST from SMOGD for SSR and SNP data together.

The SSR-based dendrogram in Fig. 2 shows a simple and plausible geographic separation of two clades and a sister group. The first clade consists of biotypes descended from field collections in Georgia: O, E, vH9, and vH13, and the second clade consists of Midwestern biotypes (B, D, and L from Indiana plus GP from Kansas). The outlier, C, was derived in the laboratory from biotype A in the eastern United States. Each clade has a succession of biotypes without any further bifurcation. However, the bootstrap support for specific nodes is weak except for the juxtaposition of E with O at 93%. The SNP-based FST dendrogram in Fig. 3 adds biotype D to the group of Georgian biotypes. Biotype C and the remaining Midwestern biotypes form the other clade. There is moderate bootstrap support for the juxtaposition of D and O and for the grouping of vH9 and vH13 with D and O, while E becomes nearly basal. The other specific nodes are weakly supported. The combined-data dendrogram in Fig. 4, using Dest instead of FST, groups the biotypes geographically in a slightly different way than Fig. 2. Georgia forms one clade, but the Indiana-based biotypes are nearly basal and C and GP push far out in the second clade. Both E and L essentially become nodes in their respective branches, thus assuming directly ancestral relationship to O and GP, respectively. All three dendrograms grouped laboratory-selected biotypes vH9 and vH13 with their wild parents, respectively, biotypes O and E (Table 2), but none of the dendrograms placed vH9 and vH13 in descendent positions from O and E. This accords with the low bootstrap support for most nodes and with the possibility that laboratory selection has reversed the divergence of E and O from more ancestral strains.
In analysis of SSRs and SNPs jointly with the program STRUCTURE, the value of ΔK peaked at K = 8, but there was also a clear peak at K = 6 (Fig. 5), indicating that the biotypes represent six to eight distinct populations. To produce Fig. 6, STRUCTURE plotted q, the posterior estimated fraction of each individual’s genotype that came from one of the K populations, across the biotypes. If a total of less than nine populations is accepted, then there must be relatively closer similarities among some of the populations. With K = 8, at least one pair of biotypes must be merged, and in Fig. 6, the Georgian biotypes O and E share a common ancestor that has combined with the ancestor of D to produce O and with the ancestor of vH13 to produce E. However, this interpretation does not accord with the documented laboratory selection of vH13 from E (Zantoko and Shukle 1997). With K = 6, biotypes O, E, and vH13 combine into one population, just like their closest grouping in Fig. 4, while B is merged with D in accord with their basal positions in Fig. 4. With either value of K, biotype O is the most similar to more than one population, as evidenced by its multicolored plotted q values in Fig. 6.

Fig. 5.

Estimates of ΔK, based on the second order rate of change of the likelihood function with respect to K, the expected number of populations (clusters), to determine K’s most likely value for the data set. Genetic distance and ΔK are plotted against the number of clusters.

Fig. 6.

Structure 2.3.3 bar plot showing relationships among biotypes.

Overall, the 18 SSR markers were more informative about the relationships among the nine biotypes, and the 22 SNP markers were more likely to be fixed within particular biotypes. The biotypes varied less at most SNP loci than at SSR loci (Table 3). The expected heterozygosity was usually lower for SNPs (Table 3, means in last column), because there were more alleles per SSR than there were variant nucleotides at a single position, as confirmed in Supp Table 1 (online only).
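
The ceiling that allele number places on expected heterozygosity can be made concrete with a short calculation. The sketch below assumes the usual gene-diversity estimator with a small-sample correction, n/(n − 1) · (1 − Σp²), where n is the number of gene copies sampled (96 for 48 diploid females). Whether this exact estimator produced the values in Table 3 is an assumption, although it would be consistent with biallelic entries of 0.51. The function name and the allele frequencies are hypothetical.

```python
def expected_heterozygosity(freqs, n_gene_copies=None):
    """Expected heterozygosity (gene diversity) from allele frequencies.

    With n_gene_copies given, apply the n/(n-1) small-sample correction;
    otherwise return the uncorrected 1 - sum(p_i^2).
    """
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    h = 1.0 - sum(p * p for p in freqs)
    if n_gene_copies is not None:
        h *= n_gene_copies / (n_gene_copies - 1)
    return h


# 48 diploid females give 96 gene copies per locus and biotype.
N_COPIES = 2 * 48

# Hypothetical, evenly distributed allele frequencies (illustration only).
snp_two_alleles = [0.5, 0.5]
ssr_eight_alleles = [0.125] * 8

snp_ceiling = expected_heterozygosity(snp_two_alleles, N_COPIES)
ssr_ceiling = expected_heterozygosity(ssr_eight_alleles, N_COPIES)
print(round(snp_ceiling, 2), round(ssr_ceiling, 2))
```

A biallelic SNP therefore cannot exceed roughly 0.5 however evenly its two alleles are distributed, whereas a locus with many alleles can approach 1.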

Table 3.

Standard diversity indices

Loci	D					O					GP					L					B
Loci	N	AR	H_O	H_E	HWE	N	AR	H_O	H_E	HWE	N	AR	H_O	H_E	HWE	N	AR	H_O	H_E	HWE	N	AR	H_O	H_E	HWE
Hf109	2	6	0.00	0.04	*	4	9	0.42	0.66	*	2	6	0.00	0.08	**	2	6	0.13	0.22	*	4	35	0.00	0.12	**
Hf113	8	27	0.21	0.50	**	4	7	0.27	0.54	**	6	39	0.38	0.46		6	29	0.31	0.55	**	5	46	0.33	0.56	**
Hf101	4	21	0.00	0.16	**	2	3	0.13	0.12		3	10	0.00	0.08	**	1	0	NA	NA	NA	4	10	0.00	0.12	**
Hf164	6	13	0.54	0.58	**	5	13	0.69	0.62		4	13	0.71	0.70		4	13	0.60	0.59		3	12	0.65	0.65
Hf112	2	6	0.04	0.15	*	2	6	0.19	0.49	**	2	6	0.13	0.36	**	2	6	0.02	0.02		2	6	0.15	0.27	*
Hf116	7	55	0.46	0.72	**	7	39	0.48	0.76	**	11	33	0.56	0.84	**	7	39	0.63	0.77	**	9	44	0.67	0.78	*
Hf32	8	56	0.46	0.79	**	9	41	0.27	0.77	**	8	29	0.08	0.68	**	8	29	0.15	0.61	**	10	41	0.17	0.69	**
Hf103	4	7	0.38	0.54	*	2	3	0.31	0.37		7	20	0.27	0.64	**	3	5	0.25	0.32		3	5	0.44	0.52
Hf108	5	29	0.67	0.61		4	12	0.54	0.55	*	4	12	0.48	0.72	**	4	12	0.73	0.69		4	12	0.60	0.60
Hf114	5	27	0.13	0.20	*	4	24	0.23	0.28		6	24	0.69	0.70		6	24	0.21	0.67	**	3	24	0.44	0.49
Hf102	2	3	0.15	0.27	*	3	9	0.19	0.31	**	3	3	0.44	0.50		2	3	0.10	0.14		3	7	0.29	0.44	*
Hf14	3	6	0.35	0.49	*	4	9	0.63	0.60		3	6	0.58	0.59		4	9	0.71	0.67		3	6	0.39	0.50
Hf104	3	6	0.25	0.46	**	4	10	0.21	0.68	**	6	17	0.33	0.65	**	6	16	0.31	0.64	**	6	25	0.04	0.56	**
Hf71	5	19	0.46	0.68	*	5	19	0.58	0.72	*	5	19	0.48	0.50		4	10	0.52	0.70	**	4	10	0.33	0.71	**
Hf56	4	9	0.44	0.68	**	6	30	0.40	0.64	**	3	9	0.58	0.66		4	9	0.63	0.64		4	15	0.27	0.31
Hf124	8	30	0.21	0.37	**	5	14	0.31	0.72	**	3	9	0.33	0.57	**	5	12	0.29	0.48	*	3	6	0.06	0.10	*
Hf55	3	16	0.10	0.14	*	4	20	0.25	0.23		2	5	0.19	0.21		2	5	0.25	0.31		3	18	0.04	0.08	*
Hf174	5	11	0.40	0.46	*	6	11	0.10	0.54	**	5	47	0.44	0.65	**	5	11	0.54	0.57		5	9	0.23	0.46	**
HfSNPa20	1	C	NA	NA	NA	1	C	NA	NA	NA	1	C	NA	NA	NA	1	C	NA	NA	NA	2	C,T	0.02	0.02
HfSNPa26	1	G	NA	NA	NA	1	G	NA	NA	NA	2	A,G	0.15	0.24	*	2	A,G	0.08	0.08		1	G	NA	NA	NA
HfSNPa32	1	A	NA	NA	NA	1	A	NA	NA	NA	1	A	NA	NA	NA	1	A	NA	NA	NA	1	A	NA	NA	NA
HfSNPa38	1	T	NA	NA	NA	2	C,T	0.00	0.04	*	2	C,T	0.33	0.43		2	C,T	0.10	0.17	*	1	T	NA	NA	NA
HfSNPa44	2	A,G	0.00	0.08	**	2	A,G	0.00	0.12	**	2	A,G	0.23	0.39	*	2	A,G	0.29	0.40		2	A,G	0.08	0.22	**
HfSNPa50	2	A,G	0.81	0.50	**	2	A,G	0.52	0.39	*	2	A,G	0.52	0.49		2	A,G	0.69	0.47	*	2	A,G	0.02	0.14	**
HfSNPa56	2	C,T	0.06	0.06		2	C,T	0.02	0.02		2	C,T	0.15	0.21		2	C,T	0.21	0.19		2	C,T	0.42	0.45
HfSNPa63	1	G	NA	NA	NA	2	A,G	0.06	0.06		2	A,G	0.21	0.19		2	A,G	0.21	0.19		1	G	NA	NA	NA
HfSNPa70	1	C	NA	NA	NA	1	C	NA	NA	NA	1	C	NA	NA	NA	1	C	NA	NA	NA	1	C	NA	NA	NA
HfSNPa76	1	G	NA	NA	NA	2	A,G	0.00	0.15	**	1	G	NA	NA	NA	2	A,G	0.02	0.02		1	G	NA	NA	NA
HfSNPa81	1	G	NA	NA	NA	1	G	NA	NA	NA	1	G	NA	NA	NA	1	G	NA	NA	NA	1	G	NA	NA	NA
HfSNPs21	2	A,C	0.04	0.04		2	A,C	0.02	0.02		1	C	NA	NA	NA	2	A,C	0.23	0.21		2	A,C	0.29	0.25
HfSNPs26	2	A,C	0.27	0.24		2	A,C	0.17	0.22		1	A	NA	NA	NA	1	A	NA	NA	NA	2	A,C	0.04	0.04
HfSNPs31	2	A,G	0.13	0.12		2	A,G	0.10	0.10		1	A	NA	NA	NA	2	A,G	0.22	0.27		2	A,G	0.79	0.48	**
HfSNPs39	1	T	NA	NA	NA	1	T	NA	NA	NA	1	T	NA	NA	NA	3	A,C,T	0.19	0.21		2	C,T	0.02	0.02
HfSNPs45	3	A,C,T	0.40	0.34		3	A,C,T	0.35	0.31		2	C,T	0.98	0.51		2	C,T	0.19	0.18		2	A,C	0.04	0.04
HfSNPs50	2	A,G	0.08	0.12		2	A,G	0.04	0.08		2	A,G	0.07	0.07		2	A,G	0.15	0.24	*	2	A,G	0.35	0.32
HfSNPs57	2	C,T	0.52	0.41		2	C,T	0.15	0.14		2	C,T	0.98	0.51	**	2	C,T	0.85	0.50	**	2	C,T	0.88	0.50	**
HfSNPs62	2	C,T	0.02	0.02		1	C	NA	NA	NA	2	C,T	0.06	0.06		2	C,T	0.17	0.19	*	1	C	0.02	0.02
HfSNPs69	2	A,G	0.19	0.21		2	A,G	0.33	0.28		2	A,T	0.49	0.37	*	2	A,G	0.73	0.47	**	2	A,G	0.19	0.41	**
HfSNPs74	2	A,G	0.04	0.04		1	A	NA	NA	NA	2	A,G	0.15	0.14		2	A,G	0.23	0.21		2	A,G	0.13	0.12
HfSNPs80	2	A,C	0.65	0.46	*	2	A,C	0.83	0.50	**	2	A,C	1.00	0.51	**	2	A,C	0.90	0.51	**	1	C	NA	NA	NA
Mean	3.0		0.21	0.26		2.9		0.22	0.30		3.0		0.30	0.34		2.9		0.30	0.33		2.8		0.21	0.27

N, observed number of alleles; S, microsatellite allelic size or SNP allele; MNA, mean number of alleles; ASR, allelic size range; Ho, observed heterozygosity; HE, expected heterozygosity; NA, monomorphic, no test done.

HWE, significantly different from Hardy–Weinberg equilibrium at *P = 0.05 and **P = 0.01.

The mean observed heterozygosity varied from 0.26 to 0.34 among biotypes for all the molecular markers together. Biotypes B and D were more inbred, and GP and E were less inbred (Fig. 7). About 37% of the combinations of SNP locus and biotype were homozygous (Supp Table 1 [online only]). Locus HfSNPa32 was uniformly homozygous. Locus HfSNPa20 is heterozygous only in Biotype B, and loci HfSNPa70 and HfSNPa81 are heterozygous only in Biotype C. All the biotypes have heterozygous individuals for loci HfSNPa50, HfSNPs45, HfSNPs50, HfSNPs57, and HfSNPs69. The high frequency of homozygosity affected the use of these genetic markers to understand population structure in Hessian fly.
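
A departure from Hardy–Weinberg proportions at a biallelic SNP can be illustrated with a one-degree-of-freedom chi-square test on genotype counts. This is a simplified substitute: Arlequin is understood to use an exact test, so the sketch is not the procedure behind the asterisks in Table 3. The function names are hypothetical. The example counts assume that all 48 GP females were scored as heterozygotes at HfSNPs80, as the observed heterozygosity of 1.00 in Table 3 implies.

```python
# Chi-square critical values for 1 degree of freedom.
CHI2_CRIT_DF1 = {0.05: 3.841, 0.01: 6.635}


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg proportions at a biallelic locus.

    Returns None for a monomorphic sample, where no test is possible.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes scored")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return None
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def significance(chi2):
    """Return '', '*', or '**' following the convention of Table 3."""
    if chi2 is None:
        return "NA"
    if chi2 >= CHI2_CRIT_DF1[0.01]:
        return "**"
    if chi2 >= CHI2_CRIT_DF1[0.05]:
        return "*"
    return ""


# Assumed genotype counts for GP at HfSNPs80: every female heterozygous.
chi2_gp = hwe_chi_square(0, 48, 0)
print(chi2_gp, significance(chi2_gp))
```

A sample made up entirely of heterozygotes is as far from Hardy–Weinberg proportions as a biallelic locus can be, which is the heterozygote-excess pattern seen at several of the sex-chromosomal SNPs.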

Fig. 7.

Mean observed heterozygosity with standard deviations.

Discussion

This investigation involved 18 SSR and 22 SNP markers in female Hessian flies. Females were used because male Hessian flies are hemizygous for the two sex chromosomes, X1 and X2; both sexes are diploid for the two autosomes, A1 and A2 (Benatti et al. 2010). Thus, somatic cells of Hessian fly have six chromosomes in males and eight in females. The germline also carries 32–45 supernumerary (E) chromosomes that are somatically eliminated during embryogenesis (Lobo et al. 2006). The contribution, if any, of the E chromosomes to genotype at the investigated loci is not known. Observed heterozygosity was generally less than expected heterozygosity for SSR loci. The means over biotypes in the last columns of Table 3 show this very clearly: observed was less than expected for 17 of the 18 SSR loci. There was less variation and lower expected heterozygosity for SNPs than for SSRs (Table 3), but there was less difference between observed and expected heterozygosity. If the map positions of SSR and SNP loci were randomly distributed on the chromosomes, this difference would be unexpected, and it would suggest that the automatic SSR scoring was undercalling heterozygotes. Small population size in culture would decrease expected heterozygosity in comparison with the wild populations from which six of the biotypes were collected, but in the absence of intrapopulation structure, small population size would not depress observed versus expected heterozygosity. Microsatellites evolve new alleles easily by change in the repeat count, but indels are also likely to be tolerated in the flanking sequence of selectively neutral, noncoding SSRs, leading to occasionally observed allelic length differences that were not an integral multiple of the motif length. Thus, the observed multiple allelism of SSR loci reflects length changes anywhere within a stretch of 20–50 or more nucleotides. 
On the other hand, observed variation at a SNP locus is limited to substitution or deletion of a single nucleotide. Thus, SNPs expectedly accumulate variants much more slowly than SSRs. If it were possible to examine complete sequence over an entire locus for all individuals in the population, much higher numbers of SNP alleles would be expected, and expected heterozygosity would rise in the absence of selection. There also can be bias for high polymorphism during the development of SSR markers (Brandstrom and Ellegren 2008). The lesser SNP allele count also raises, though does not answer, the question of whether the sampled SSRs are more selectively neutral than the sampled SNPs. Selection is likely to occur strongly against nonsynonymous SNP mutations within genes, whereas an intergenic SSR is likely to be neutral. The higher frequency of the most common allele (Supp Table 1 [online only]) reduced the usefulness of SNP loci to understand population structure and relationships. Thus, the SSR-based neighbor-joining tree is more plausible than the SNP-based tree, and the tree from combined data was more similar to the SSR-only tree. The statistical power of a genetic marker type to document population structure is related to the number and frequency of independent alleles (Kalinowski 2002), and in general, an individual biallelic SNP is less powerful than an individual multiply allelic SSR locus. By simulations Morin et al. (2009) showed that the statistical power of a panel of SNP loci to detect population differentiation depended upon the number of individuals sampled and the value of FST. With 20 SNP loci, 50 individuals per sample, and FST = 0.01, the power was 0.72. The Hessian fly data yielded pairwise FST values in the range 0.03–0.38 by Arlequin, and therefore the power should be greater than 0.72. Morin et al. (2009) also cite empirical evidence that 2–3.5 SNP loci match one SSR in discriminatory power, although this evidence comes from fish and mammals that have a longer generation time (years vs. weeks) than Hessian fly. Heterozygote excess and heterozygote deficiency contribute to deviation from Hardy–Weinberg equilibrium. Heterozygote excess suggests that heterozygotes have a survival advantage, perhaps even linkage to balanced lethals. Microsatellite locus Hf124 deviated significantly from Hardy–Weinberg equilibrium in Biotype E, even though the heterozygotes were as frequent here as in other biotypes. Since these collections have been in culture up to 55 years (Gallun et al. 1961), biotype admixture in the laboratory must be considered as a possible cause of deviation from Hardy–Weinberg equilibrium. It is impossible to prove that admixture has or has not ever occurred given the available data, but a rapid return to equilibrium would be expected if it had occurred. Hessian fly biotype is defined by virulence on tester wheat varieties, and thus different collections of the same biotype can differ in evolutionary history as alleles conferring particular virulence patterns arise (perhaps independently) and spread horizontally among populations (in response to migration and selection as new resistance genes are deployed in wheat). The genetic basis of host-pest interaction constrains the evolution of virulence. For example, if virulence requires many incremental changes that each marginally increases survival on the host, then the evolution of a new biotype (pattern of virulence) is difficult, and a succession of increasingly virulent forms is expected as new resistance genes are deployed in the host.
On the other hand, if the interaction depends upon very specific interaction of particular alleles in a few pest virulence genes with particular alleles in a few host genes, there is little reason for any particular ancestor-descendent relationship versus other relationships among the naturally occurring biotypes; a few correct mutations could convert an avirulent genotype into a highly virulent genotype. The available evidence in Hessian fly favors the latter hypothesis; virulence apparently follows the classic gene-for-gene model from plant pathology. Ratcliffe et al. (2002) found segregation for one or two dominant Hessian fly resistance loci in durum wheat lines, and similar findings in bread wheat and related wild species have resulted in the naming of several dozen H resistance genes, through at least H61. Some of these resistance genes have been mapped onto five chromosomes of wheat, 3A, 3DL, 5BS, 6D, and especially an 11-cM region on 1AS, where several named resistance genes are potentially alleles of a single locus (Wang et al. 2006, Li et al. 2015). Given four standard tester wheat varieties, 16 patterns of virulence are possible, and all exist in Hessian fly. This alone argues against the incremental evolution of progressively more comprehensive virulence. Thus, the apparently close relationship between least virulent biotype GP and most virulent biotype L in Figs. 2 and 3 is as plausible as other relationships would be. However, the implication that GP has descended from L in Fig. 4 is inconsistent with history. While geographic cohesion of related biotypes is expected if Hessian fly has evolved regional adaptation to climate, another cause of cohesion is the regional nature of wheat breeding programs, which might cause different sets of resistance genes to be deployed in different areas. The wheat varieties differ in drought tolerance, winter hardiness (for winter wheats), heading date, and so on.
There is probably feedback as Hessian fly responds to selective pressure from new resistance genes and breeders respond to new patterns of virulence that initially are geographically restricted. In any case, the SSR and SNP evidence presented here suggests that GP does not occupy the basal evolutionary position expected of a least virulent biotype. The simple geographic relationship shown with SSRs also suggests that the cultures have been successfully isolated from one another since their establishment.
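The discussion of heterozygote excess and deficiency relative to Hardy–Weinberg equilibrium can be illustrated with a small calculation. The Python sketch below is not the procedure used in this study; it assumes a single biallelic locus (as for a SNP, whereas microsatellites are usually multiallelic), uses a simple chi-square goodness-of-fit test with one degree of freedom, and the genotype counts in the example are invented. The function name `hwe_test` is hypothetical.

```python
import math


def hwe_test(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions at a biallelic locus.

    Returns (chi2, p_value, f_is). A negative f_is indicates heterozygote
    excess; a positive f_is indicates heterozygote deficiency.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # Inbreeding coefficient: 1 - observed / expected heterozygosity.
    f_is = 1.0 - n_ab / expected[1] if expected[1] > 0 else 0.0
    return chi2, p_value, f_is


# Hypothetical counts showing a strong heterozygote excess.
chi2, p_value, f_is = hwe_test(10, 80, 10)
print(chi2, p_value, f_is)
```

With small samples or rare alleles an exact test is generally preferred over the chi-square approximation, so this should be read only as an illustration of the quantities involved.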

Supplementary Data

Supplementary data are available at Journal of Insect Science online.

22 in total

1. The evolution of Hessian fly from the Old World to the New World: Evidence from molecular markers.

Authors: Brandon J Schemerhorn; Yan Ma Crane; Charles F Crane
Journal: Insect Sci Date: 2014-12-22 Impact factor: 3.262

2. A neo-sex chromosome that drives postzygotic sex determination in the hessian fly (Mayetiola destructor).

Authors: Thiago R Benatti; Fernando H Valicente; Rajat Aggarwal; Chaoyang Zhao; Jason G Walling; Ming-Shun Chen; Sue E Cambron; Brandon J Schemerhorn; Jeffrey J Stuart
Journal: Genetics Date: 2009-12-21 Impact factor: 4.562

3. Genotypic interaction between resistance genes in wheat and virulence genes in the Hessian fly Mayetiola destructor (Diptera: Cecidomyiidae).

Authors: M El Bouhssini; J H Hatchett; T S Cox; G E Wilde
Journal: Bull Entomol Res Date: 2001-10 Impact factor: 1.750

4. Biotype composition of Hessian fly (Diptera: Cecidomyiidae) populations from the southeastern, midwestern, and northwestern United States and virulence to resistance genes in wheat.

Authors: R H Ratcliffe; S E Cambron; K L Flanders; N A Bosque-Perez; S L Clement; H W Ohm
Journal: J Econ Entomol Date: 2000-08 Impact factor: 2.381

5. Localization and characterization of 170 BAC-derived clones and mapping of 94 microsatellites in the Hessian fly.

Authors: Brandon J Schemerhorn; Yan M Crane; Philip K Morton; Rajat Aggarwal; Thiago Benatti
Journal: J Hered Date: 2009-07-10 Impact factor: 2.645

Review 6. A review on SNP and other types of molecular markers and their use in animal genetics.

Authors: Alain Vignal; Denis Milan; Magali SanCristobal; André Eggen
Journal: Genet Sel Evol Date: 2002 May-Jun Impact factor: 4.297

7. Genome-wide analysis of microsatellite polymorphism in chicken circumventing the ascertainment bias.

Authors: Mikael Brandström; Hans Ellegren
Journal: Genome Res Date: 2008-03-20 Impact factor: 9.043

8. Economic impact of Hessian fly (Diptera: Cecidomyiidae) on spring wheat in Oregon and additive yield losses with Fusarium crown rot and lesion nematode.

Authors: Richard W Smiley; Jennifer A Gourlie; Ruth G Whittaker; Sandra A Easley; Kimberlee K Kidwell
Journal: J Econ Entomol Date: 2004-04 Impact factor: 2.381

9. Genomic analysis of a 1 Mb region near the telomere of Hessian fly chromosome X2 and avirulence gene vH13.

Authors: Neil F Lobo; Susanta K Behura; Rajat Aggarwal; Ming-Shun Chen; Frank H Collins; Jeff J Stuart
Journal: BMC Genomics Date: 2006-01-16 Impact factor: 3.969

10. Precisely mapping a major gene conferring resistance to Hessian fly in bread wheat using genotyping-by-sequencing.

Authors: Genqiao Li; Ying Wang; Ming-Shun Chen; Erena Edae; Jesse Poland; Edward Akhunov; Shiaoman Chao; Guihua Bai; Brett F Carver; Liuling Yan
Journal: BMC Genomics Date: 2015-02-21 Impact factor: 3.969
