Literature DB >> 32694639

Genome-wide analysis of diamondback moth, Plutella xylostella L., from Brassica crops and wild host plants reveals no genetic structure in Australia.

Kym D Perry1,2, Michael A Keller3, Simon W Baxter4.   

Abstract

Molecular studies of population structure can reveal insight into the movement patterns of mobile insect pests in agricultural landscapes. The diamondback moth, Plutella xylostella L., a destructive pest of Brassica vegetable and oilseed crops worldwide, seasonally colonizes winter canola crops in southern Australia from alternative host plant sources. To investigate movement, we collected 59 P. xylostella populations from canola crops, Brassica vegetable and forage crops and brassicaceous wild host plants throughout southern Australia in 2014 and 2015 and genotyped 833 individuals using RAD-seq for genome-wide analysis. Despite a geographic sampling scale > 3,000 km and a statistically powerful set of 1,032 SNP markers, there was no genetic differentiation among P. xylostella populations irrespective of geographic location, host plant or sampling year, and no evidence for isolation-by-distance. Hierarchical STRUCTURE analysis at K = 2-5 showed nearly uniform ancestry in both years. Cluster analysis showed divergence of a small number of individuals at several locations, possibly reflecting an artefact of sampling related individuals. It is likely that genetic homogeneity within Australian P. xylostella largely reflects the recent colonization history of this species but is maintained through some level of present gene flow. Use of genome-wide neutral markers was uninformative for revealing the seasonal movements of P. xylostella within Australia, but may provide more insight in other global regions where the species has higher genetic diversity.

Entities:  

Mesh:

Year:  2020        PMID: 32694639      PMCID: PMC7374630          DOI: 10.1038/s41598-020-68140-w

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.996


Introduction

Mobile insect pests regularly colonize annual crops from alternative host plant sources[1-3]. Protecting crops from attack by these pests is difficult for pest managers due to the unpredictable nature of seasonal outbreaks, particularly when insecticide resistant genotypes are present[4,5]. For mobile insect pests, dispersal among crop and non-crop host plant resources influences both the seasonal dynamics and genetic background of pest populations, with direct consequences for pest management[6-9]. Molecular studies of population structure and gene flow can potentially provide insight into patterns of insect movement in agricultural landscapes[10,11]. The diamondback moth, Plutella xylostella L., is the most destructive pest of brassicaceous crops worldwide[12,13]. It attacks Brassica vegetable crops throughout tropical and temperate regions[14] and in recent decades has become a significant pest of canola crops in temperate regions[4,15-18]. The propensity of P. xylostella to evolve insecticide resistance rapidly and a lack of alternative control options has led it to evolve resistance to most pesticides[4]. Within Australia, P. xylostella has been a major pest of Brassica vegetable crops since the late 1800s[19] and a sporadic but damaging pest of canola crops since the 1990s following a dramatic expansion of canola production[18]. Approximately 3 million hectares of canola is grown annually under a Mediterranean climate in southern Australia[20], providing vast host resources for P. xylostella from crop planting in autumn to maturity in late spring. Intermittent outbreaks of P. xylostella in canola during spring cause substantial crop losses[18,21]. Commercial Brassica vegetable crops are grown continuously in horticultural areas surrounding the major urban centres in each Australian state, occupying a total area less than 1% of total canola plantings[22]. Other host plants for P. xylostella include Brassica forage crops grown during spring and summer as stock feed, and a diversity of introduced and native wild brassicaceous species distributed over vast areas and proliferated by rainfall. In Brassica vegetable crops, limited P. xylotella dispersal and intense insecticide use targeting this insect can lead to elevated levels of insecticide resistance in local P. xylostella populations[6,22,23]. In canola-growing areas, the highly seasonal availability of host plants compels P. xylostella to move regularly among crops and other brassicaceous host plants, which tends to homogenise levels of insecticide resistance[24]. Estimating gene flow among local populations of P. xylostella between host plant types and identifying source populations of P. xylostella that seasonally infest Australian canola are essential to facilitate forecasting of seasonal pest pressure and inform insecticide resistance management[13,18,19]. Various molecular markers have been used to investigate population structure in P. xylostella, including allozymes, ISSRs, microsatellites and mtDNA. Plutella xylostella populations from different continents are clearly differentiated[19,25,26], but a lack of genetic structure among populations within parts of Asia[10,27,28], the USA[29,30] and Australasia[19,31] implies regular intermixing at an intra-continental level. Many population genetic studies of P. xylostella have been difficult to interpret due to limited sampling[32] and few studies have sampled at a sufficient spatiotemporal scale or resolution to investigate movement at a landscape scale. However, two studies successfully identified the seasonal migration pathways of P. xylostella in China through extensive field sampling and analysis of both microsatellite markers and geographic variation in mtDNA haplotype frequencies[10,28]. Inferences from genetic data were corroborated by light trapping[33]. Within Australia, Endersby et al.[19] found no differentiation at six microsatellite loci among 17 populations across Australia and one from New Zealand, despite a sampling scale spanning > 5,000 km. These Australasian populations were clearly differentiated from populations collected in Asia and Africa[19]. Australian P. xylostella has low genetic diversity consistent with a founder effect[19,26,31,34,35]. Present levels of gene flow among Australian P. xylostella populations remain to be resolved because genetic homogeneity could reflect co-ancestry[18], and because the statistical power of six microsatellites to detect weak population structure was uncertain. Furthermore, inconsistent patterns of population structure reported among P. xylostella collected from eastern Australia[25,36] may reflect the presence of a cryptic species, Plutella australiana, among analysed samples[35,37]. The revolution in massively parallel sequencing technologies[38] and associated genotyping methods has facilitated genome-wide genetic marker sets and brought unprecedented resolution to questions of population structure[39,40]. Restriction-site-associated DNA sequencing (RAD-seq)[41] enables sequencing of targeted short regions across the genome, allowing simultaneous discovery and genotyping of single nucleotide polymorphisms (SNPs) in model and non-model species[42,43]. The ability to sequence orthologous regions across multiple individuals at high sequencing coverage makes it possible to confidently genotype SNPs and generate high density markers for population genetic studies[40,44]. Microsatellites remain popular for population genetic studies due to high polymorphism[45], but can be outperformed by large SNP panels in resolving population structure[46,47], with several examples in insects[48,49]. RAD-seq has genotyped thousands of SNPs in P. xylostella[50] and resolved species-level nuclear divergence between cryptic Australian Plutella species[35], suggesting potential for this method to provide insight into the movement patterns of P. xylostella. Here, we examined whether geographic, host plant-related or temporal population genetic structure exists among geographically distinct populations of P. xylostella in Australia. Samples were collected from canola crops, Brassica vegetable crops, Brassica forage crops and wild brassicaceous plant species throughout southern Australia and in two consecutive years to facilitate temporal comparisons. After molecular species identification, P. xylostella individuals were genotyped across genome-wide sites using RAD sequencing for population genetics analysis.

Results

Sample collection

Plutella species were collected from different Brassica host plants and locations throughout southern Australia in 2014 and 2015 (Fig. 1). After species identification using PCR-RFLP, 909 P. xylostella individuals from 60 locations, 32 in 2014 and 28 in 2015, were retained for analysis (Table 1). In total, 29 populations were collected from canola crops, 15 from Brassica vegetable crops, three from Brassica forage crops and 13 from brassicaceous weeds. Of these, 52 populations were collected in spring and seven in autumn. Seven locations were sampled in both 2014 and 2015 to facilitate a temporal analysis, of which five locations were Brassica vegetable crops from the major Brassica vegetable production areas in each Australian state (Fig. 1). Sex was determined for the 681 pupal individuals (82% of all individuals) but not larvae. The overall sex ratio was not different from 1:1 (364 males, 317 females, , p = 0.0717) and most populations had a reasonably balanced sex ratio (Table 1).
Figure 1

Geographic locations of 59 P. xylostella populations collected in Australia in 2014 and 2015 and sequenced using RAD-seq. Collections from different Brassica host types are represented by different colours. Canola is grown in dryland cropping areas of southern Australia, represented in grey shading.

Table 1

Summary of P. xylostella collections from Australia.

Location1Collection dateCoordinatesHost plantNo. sequenced2
Total
Boomi NSWSep-201428.76° S 149.81° ECanola1073
Ginninderra NSWOct-201535.19° S 149.05° ECanola1468
Henty NSWOct-201435.60° S 146.95° ECanola1697
Narromine NSWSep-201432.22° S 148.03° ECanola1673
Richmond NSWOct-201533.60° S 150.71° ECabbage1679
Werombi NSWNov-201433.99° S 150.64° EBrassica vegetables1054
Werombi NSWOct-201534.00° S 150.56° EKale945
Bundaberg QLDOct-201424.80° S 152.26° ECanola1273
Bundaberg QLDSep-201524.80° S 152.26° ECanola16510
Dalby QLDSep-201427.28° S 151.13° ECanola1468
Gatton QLDOct-201427.54° S 152.33° EBroccoli1465
Gatton QLDNov-201527.54° S 152.33° EBroccoli1477
Warwick QLDOct-201528.21° S 152.11° ECanola1486
Calca SAApr-201433.02° S 134.28° ESand rocket, wall rocket941
Cocata SASep-201433.20° S 135.13° ECanola1647
Colebatch SAFeb-201535.97° S 139.66° EBrassica forage1245
Cowell SASep-201433.66° S 137.16° ECanola1660
Keith SAOct-201436.09° S 140.29° ECanola1256
Littlehampton SASep-201535.06° S 138.90° EBrussels sprouts933
Loxton SASep-201434.37° S 140.72° ECanola1688
Mallala SASep-201534.38° S 138.50° ECanola1596
Millicent SAApr-201537.61° S 140.34° ECanola920
Minnipa SAOct-201532.81° S 135.16° ECanola1686
Moonaree SAAug-201431.99° S 135.87° EWard’s weed1600
Mt Hope SASep-201434.14° S 135.33° ECanola1676
Mt Hope SASep-201534.20° S 135.34° ECanola1679
Padthaway SAOct-201536.56° S 140.43° ECanola1495
Picnic Beach SAApr-201434.17° S 135.27° ESea rocket802
Redbanks SAOct-201434.49° S 138.59° ECanola1536
Southend SAApr-201537.57° S 140.12° ESea rocket1688
Tintinara SAOct-201535.97° S 139.66° EBrassica forage1688
Virginia SAOct-201434.64° S 138.54° EBroccoli1641
Virginia SASep-201534.64° S 138.54° ECabbage16105
Walkers Beach SASep-201433.55° S 134.86° ESea rocket1676
Walkers Beach SAMar-201533.55° S 134.86° ESea rocket1688
Walkers Beach SASep-201533.55° S 134.86° ESea rocket1266
Wirrabara SAOct-201432.99° S 138.31° ECanola1553
Wokurna SASep-201533.67° S 137.96° EWild radish1694
Wurramunda SAApr-201434.30° S 135.56° EVolunteer canola1697
Deddington TASNov-201441.59° S 147.44° EKale1266
Deddington TASNov-201541.59° S 147.44° ECauliflower1657
Launceston TASNov-201441.47° S 147.14° EWild mustard1697
Cowangie VICOct-201535.10° S 141.33° ECanola1575
Ouyen VICSep-201435.00° S 142.31° ECanola1595
Werribee VICOct-201437.94° S 144.73° ECauliflower1623
Werribee VICNov-201537.94° S 144.73° ECauliflower1376
Yanac VICSep-201436.06° S 141.25° ECanola1266
Boyup Brook WASep-201433.64° S 116.40° ECanola1553
Dalyup WAOct-201533.72° S 121.64° EWild radish1697
Esperance WASep-201433.29° S 121.76° ECanola1221
Esperance WAOct-201533.79° S 122.13° ECanola1578
Gingin WADec-201431.28° S 115.65° ERed cabbage16106
Kalannie WASep-201530.00° S 117.25° ECanola1688
Manjimup WADec-201434.18° S 116.23° EChinese cabbage954
Manjimup WANov-201534.18° S 116.23° EBrassica vegetables1339
Narrogin WAOct-201532.96° S 117.33° ECanola1376
Narrogin WAOct-201532.95° S 117.32° EWild radish, volunteer canola1688
Northampton WASep-201428.16° S 114.63° ECanola1694
Walkaway WASep-201428.94° S 114.83° ECanola1634

1Australian states: NSW New South Wales, QLD Queensland, SA South Australia, TAS Tasmania, VIC Victoria, WA Western Australia.

2Total includes males and females (pupae) and unknown sex (larvae)

Geographic locations of 59 P. xylostella populations collected in Australia in 2014 and 2015 and sequenced using RAD-seq. Collections from different Brassica host types are represented by different colours. Canola is grown in dryland cropping areas of southern Australia, represented in grey shading. Summary of P. xylostella collections from Australia. 1Australian states: NSW New South Wales, QLD Queensland, SA South Australia, TAS Tasmania, VIC Victoria, WA Western Australia. 2Total includes males and females (pupae) and unknown sex (larvae)

Read filtering and variant calling

RAD-seq was performed for 909 P. xylostella individuals from 60 collection locations, including 15 individuals randomly selected from different library pools and sequenced as technical duplicates to check the robustness of genotype calls. Illumina sequencing yielded 2.36 billion raw sequence reads after de-multiplexing. Following read trimming and filtering, mapping, genotype calling and hard-filtering, we excluded 50 individuals with greater than 60% missing data, which was largely due to low sequencing depth (Supplementary Fig. S1), then excluded the 15 technical duplicates and a population with only two individuals remaining. Nine individuals with unusually high levels of polypmorphism and investigated using mtDNA amplicon sequencing were found to be contaminated and were excluded. Genotyping and hard-filtering steps were then repeated for the remaining 833 individuals across 59 population samples, including 434 individuals from 31 populations collected in 2014 and 399 individuals from 28 populations collected in 2015. Hard-filtering retained 590,086 confidently-called (GQ  30) variant and invariant sites at a mean depth of 33.4 per individual, and a subset of 1,032 widely-dispersed (to avoid linkage bias) bi-allelic SNP variants at a mean depth of 34.0 per individual, for downstream analyses. In reference-aligned SNP datasets with read depth > 30, genotyping error rates are expected to be < 0.01[51]. The datasets for 2014 and 2015 were analysed separately. For the 15 technical duplicates, the VCF output from HaplotypeCaller was hard-filtered using our parameters to retain the 30 samples and a set of 1,473 widely-dispersed bi-allelic SNP variants. Principal components analysis showed that sample pairs group closely together, indicating that genotype calls were highly consistent (Supplementary Fig. S2).

Genetic diversity statistics

Population genetic diversity was estimated using the 590,086 variant and invariant sites. The mean observed heterozygosity per population averaged 0.0092 ± 0.0002 SD (range = 0.0088, 0.0097) across the 59 populations and showed little variation across populations collected from different years (2014, n = 31 and 2015, n = 28), host plant types (canola, n = 30, Brassica vegetable crops, n = 15, Brassica forage crops, n = 2, and wild brassicas, n = 12) or seasons (autumn, n = 7 or spring, n = 52). In general, observed heterozygosity was lower than expected as shown by mostly positive values, suggesting some inbreeding (Tables 2, 3, Supplementary Fig. S3). The population from Southend 2015 had reduced gene diversity and fewer private sites relative to other populations. Across the 1,032 SNPs, observed heterozygosity and gene diversity within each year showed reasonable agreement (Tables 2, 3, Supplementary Fig. S3), indicating allele frequencies at these loci are in Hardy–Weinberg proportions. Again, for this marker set, the Southend 2015 population had the lowest genetic diversity among populations, contributing to a negative value.
Table 2

Population statistics for all 590,086 confidently-called variant and invariant sites, and a subset of 1,032 hard-filtered SNP loci, for 31 P. xylostella populations collected from Australia in 2014.

PopulationAll variant and invariant sites1,032 SNP variants
NSitesSite depthSNPsIndelsPrivate sites\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_{\text {O}}$$\end{document}HO\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_{\text {S}}$$\end{document}HS\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$F_{\text{IS}}$$\end{document}FISNSite depth\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_{\text {O}}$$\end{document}HO\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_{\text {S}}$$\end{document}HS\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$F_{\text{IS}}$$\end{document}FIS
Boomi NSW9.5562,586388,5901,653160.00900.00950.03989.3380.20960.2057− 0.0204
Henty NSW15.3564,870338,4181,618140.00920.00960.049614.5330.20420.20520.0048
Narromine NSW15.0553,119308,2161,558180.00930.00970.038213.8310.20770.2055− 0.0081
Werombi NSW9.3550,438268,0861,518160.00950.00970.01798.4280.21200.2074− 0.0244
Bundaberg QLD11.3557,174388,3381,578160.00910.00960.045110.7380.20500.2030− 0.0105
Dalby QLD13.5567,483368,4951,630160.00930.00960.040212.9360.20950.2086− 0.0020
Gatton QLD12.9543,911287,9381,491120.00950.00960.015212.0290.21600.2030− 0.0459
Calca SA8.3546,958408,2501,588300.00950.00990.03547.5400.22080.2205− 0.0076
Cocata SA15.0553,050378,1191,560130.00930.00970.036713.9370.20140.20310.0040
Cowell SA15.1557,172328,2761,578170.00940.00980.037813.8320.21120.2094− 0.0077
Keith SA10.8532,878247,5991,434180.00970.00980.01049.3260.21720.2065− 0.0385
Loxton SA15.3564,013428,5901,639220.00910.00960.054015.0420.19650.20220.0182
Moonaree SA15.2560,304338,3541,595170.00940.00970.038514.0340.21420.2082− 0.0207
Mt Hope SA15.2560,623378,2621,593140.00920.00960.045914.3370.19860.20140.0067
Picnic Beach SA7.5550,986448,1251,561330.00970.00990.01286.4440.22330.2144− 0.0400
Redbanks SA13.2519,055367,5911,417170.00910.00960.053612.9360.20840.2056− 0.0106
Virginia SA15.3564,927328,4371,620160.00920.00970.046714.5330.20870.2063− 0.0072
Walkers Beach SA15.2560,602358,3711,599210.00910.00970.051814.8350.20020.20180.0013
Wirrabara SA13.6536,022387,8881,512130.00910.00960.054113.1380.20310.2021− 0.0032
Wurramunda SA15.3565,796418,6301,651200.00910.00950.042715.1410.20010.20300.0117
Deddington TAS11.0539,076257,7921,454170.00970.00980.01719.6260.21820.2109− 0.0292
Launceston TAS15.1557,084338,3181,602150.00930.00970.040614.1340.21100.2072− 0.0120
Ouyen VIC14.2557,715348,2461,589180.00940.00970.035013.2340.20820.2061− 0.0072
Werribee VIC15.2560,377358,4111,599170.00920.00970.050714.4350.20930.20960.0024
Yanac VIC11.6569,684398,5341,638170.00900.00960.052011.2390.19840.20440.0168
Boyup Brook WA14.4566,791358,5101,630140.00920.00960.036513.8350.20890.2031− 0.0219
Esperance WA11.2550,595338,1561,551300.00950.00970.018210.1340.21280.2069− 0.0289
Gingin WA15.2559,983358,3531,590140.00890.00960.071014.8350.19590.20140.0160
Manjimup WA8.4553,540278,1881,551140.00950.00960.01077.7280.20610.2043− 0.0173
Northampton WA15.4568,041388,5581,646160.00900.00950.054315.0380.19490.20070.0158
Walkaway WA15.2560,808388,3111,591160.00900.00960.061914.9390.19780.20260.0207

N, number of individuals genotyped per locus; , observed heterozygosity; , gene diversity; , Nei’s inbreeding coefficient.

Table 3

Population statistics for all 590,086 confidently-called variant and invariant sites, and a subset of 1,032 hard-filtered SNP loci, for 28 P. xylostella populations collected from Australia in 2015.

PopulationAll variant and invariant sites1,032 SNP variants
NSitesSite depthSNPsIndelsPrivate sites\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_{\text {O}}$$\end{document}HO\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_{\text {S}}$$\end{document}HS\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$F_{\text{IS}}$$\end{document}FISNSite depth\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_{\text {O}}$$\end{document}HO\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H_{\text {S}}$$\end{document}HS\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$F_{\text{IS}}$$\end{document}FIS
Goulburn NSW13.3559,574338,3181,574150.00900.00960.054512.9340.19480.19850.0153
Richmond NSW15.2560,841378,3491,595140.00890.00960.063414.9380.19830.19940.0005
Werombi NSW8.5556,788428,4731,605130.00880.00940.04388.3420.20330.2008− 0.0127
Bundaberg QLD14.5536,071257,7771,472160.00950.00970.037113.3260.21310.2065− 0.0247
Gatton QLD12.7533,431277,6741,438180.00950.00980.036511.4280.21290.2116− 0.0019
Warwick QLD13.2555,382338,3281,590200.00920.00970.046112.3330.20120.2020− 0.0015
Colebatch SA11.5565,070318,4491,612170.00910.00960.044511.0320.20230.20360.0019
Littlehampton SA8.4551,419388,4631,616190.00910.00970.04958.2380.20390.20680.0005
Mallala SA13.9545,678288,0121,526200.00930.00960.035013.1290.20900.2060− 0.0133
Millicent SA8.1532,566328,0541,539150.00890.00960.06077.8320.20200.20330.0010
Minnipa SA15.3563,734348,4331,608170.00910.00950.050614.6340.20630.2061− 0.0004
Mt Hope SA14.9551,211308,1171,542180.00930.00970.041713.4310.21640.2101− 0.0211
Padthaway SA12.6529,583307,8041,488160.00910.00960.051312.0300.20420.20610.0005
Southend SA15.3563,465348,3651,59760.00930.0091− 0.013114.4340.20990.1953− 0.0598
Tintinara SA14.6539,118257,8531,499140.00960.00970.027813.4260.20930.2026− 0.0202
Virginia SA15.4567,767358,5481,648150.00920.00960.048514.9350.20530.2058− 0.0023
Walkers Beach SA15.3564,509368,4551,627150.00910.00940.040714.8360.19680.1958− 0.0067
Walkers Beach SA11.3557,146268,2461,564150.00950.00980.027610.4270.20990.2064− 0.0137
Wokurna SA15.1558,036338,2461,579190.00940.00970.037914.0340.21200.2105− 0.0098
Deddington TAS15.3565,674408,4921,630160.00910.00960.054415.0400.20220.20440.0054
Cowangie VIC14.4565,260358,4571,612190.00920.00960.044513.7350.21000.2083− 0.0051
Werribee VIC11.9538,661257,8781,469160.00930.00970.041611.0270.20910.2050− 0.0176
Dalyup WA15.3562,709328,4521,624170.00920.00970.045914.4320.20270.20480.0017
Esperance WA13.5532,826277,7181,460200.00960.00980.023912.2280.21090.2043− 0.0262
Kalannie WA15.3564,388338,4101,614170.00910.00960.049614.6340.20460.20650.0009
Manjimup WA12.3556,387368,2741,567180.00910.00960.051712.0370.20070.20320.0095
Narrogin WA11.9541,517297,9471,512160.00940.00970.037611.1300.21150.2104− 0.0048
Narrogin WA15.1557,879348,2841,582180.00900.00960.057914.7350.19690.20110.0125

N, number of individuals genotyped per locus; , observed heterozygosity; , gene diversity; , Nei’s inbreeding coefficient.

Population statistics for all 590,086 confidently-called variant and invariant sites, and a subset of 1,032 hard-filtered SNP loci, for 31 P. xylostella populations collected from Australia in 2014. N, number of individuals genotyped per locus; , observed heterozygosity; , gene diversity; , Nei’s inbreeding coefficient. Population statistics for all 590,086 confidently-called variant and invariant sites, and a subset of 1,032 hard-filtered SNP loci, for 28 P. xylostella populations collected from Australia in 2015. N, number of individuals genotyped per locus; , observed heterozygosity; , gene diversity; , Nei’s inbreeding coefficient.

Power analysis

The power analysis indicated that our SNP marker loci had a high level of statistical power to detect even weak population structure. The 1,032 SNP loci had 100% probability of detecting true values of 0.0027 or 0.0056 (Supplementary Table S1), corresponding to the estimated global values for the 2014 and 2015 datasets.

Population differentiation

The global estimates of calculated using 1,032 SNPs were not significantly different from zero in either 2014 (, 99% CL  0.0043, 0.0107) or 2015 (, 99% CL  0.0019, 0.0138), indicating a lack of genetic differentiation among populations within years. Pairwise values were generally very low, ranging from 0.0065 to 0.0178 (mean 0.0029 ± 0.0040 SD) in 2014 and − 0.0077 to 0.0344 (mean 0.0054 ± 0.0075 SD) in 2015 (Fig. 2). After correction for multiple comparisons (2014: n = 465 comparisons, 2015: n = 365 comparisons), no pairwise values were significant at the target level, indicating a lack of genetic differentiation among P. xylostella populations collected within a single year. The highest pairwise values were associated with the Southend 2015 population, ranging from 0.0221 to 0.0344 (mean 0.0265 ± 0.0035 SD, n = 27 comparisons), indicating allele frequencies in this population were the most divergent from other populations (Fig. 2). AMOVA analysis using 1,032 SNPs indicated a lack of any spatial, temporal or host-plant related genetic structure among populations (Table 4). In model A, where populations were divided into years and Brassica host types, > 99% of variance was found within populations with negligible variance among populations explained by year or host type. Similarly, in model B, where seven locations were sampled in both years, > 99% of variance was found within populations. These results precluded interpretation of whether there was more spatial or temporal variance among populations.
Figure 2

Heat maps showing pairwise comparisons of genetic distance measured as Weir and Cockerham’s (1984) (top panels) and geographic distance in kilometres (bottom panels) among P. xylostella populations collected from Australia in 2014 (left panels) and 2015 (right panels). Within each year, populations on x and y-axes are sorted geographically from north-western to north-eastern Australia in an arc following the southern coast. Visual comparison of the and geographic distance heat maps within each year shows no congruence between genetic and geographic distance among population pairs in 2014 or 2015.

Table 4

Analysis of molecular variance under two hierarchical model structures.

AMOVA summary
SourcedfSSMSEst. var.%
Model A
Year1141.844141.8440.0150.013
Host5651.478130.2960.0490.043
Population526,487.275124.7550.9020.798
Error77486,712.706112.032112.03299.145
Total83293,993.302112.998100.000
Model B
Year1102.038102.038− 0.162− 0.154
Location121,409.908117.4920.9230.875
Error18118,947.900104.685104.68499.278
Total19420,459.846105.445100.000

In Model A, all 59 populations collected from four Brassica host types in 2014 and 2015 were analyzed and variance was partitioned among years, among host within years and among populations within host. In Model B, populations from seven locations sampled in both 2014 and 2015 were analyzed and variance was partitioned among years and among locations within years.

Heat maps showing pairwise comparisons of genetic distance measured as Weir and Cockerham’s (1984) (top panels) and geographic distance in kilometres (bottom panels) among P. xylostella populations collected from Australia in 2014 (left panels) and 2015 (right panels). Within each year, populations on x and y-axes are sorted geographically from north-western to north-eastern Australia in an arc following the southern coast. Visual comparison of the and geographic distance heat maps within each year shows no congruence between genetic and geographic distance among population pairs in 2014 or 2015. Under isolation by distance, geographic and genetic distances should be positively correlated[52]. Populations were collected across geographic distances of up to 3,756 km (Northampton WA/Bundaberg QLD) in 2014 (mean distance 1,323 ± 960 SD km) and 3,624 kilometres (Manjimup WA/Bundaberg QLD) in 2015 (mean distance 1,263 ± 917 SD km). Given the vast sampling scale, we expected higher values at greater geographic distances between population pairs, however heat maps revealed no such pattern (Fig. 2). Mantel tests confirmed a lack of genetic isolation by distance in both 2014 (Mantel’s r = 0.1136, p = 0.1316) and 2015 (Mantel’s r = − 0.0901, p = 0.8222) datasets, indicating that P. xylostella populations in close proximity or separated by thousands of kilometres were equally differentiated. Analysis of molecular variance under two hierarchical model structures. In Model A, all 59 populations collected from four Brassica host types in 2014 and 2015 were analyzed and variance was partitioned among years, among host within years and among populations within host. In Model B, populations from seven locations sampled in both 2014 and 2015 were analyzed and variance was partitioned among years and among locations within years. Population structure was explored using two different individual-based clustering approaches. First, STRUCTURE analysis was performed using the widely-dispersed 1,032 SNPs and analysing 2014 and 2015 populations separately. We first determined the predicted optimal values for K, then examined bar plots for several K values to assess hierarchical population structure. In 2014, the data most likely formed two genotypic clusters, with the delta K method and mean likelihood value both producing an optimal at K = 2 (Supplemementary Fig. S4). At this K value, bar plots showed that most individuals shared nearly uniform ancestry across the major genotypic cluster regardless of geographic location (Fig. 3). A second genotypic cluster was largely associated with three individuals from Esperance, which showed 98.7%, 98.7% and 56.5% cluster assignment, while of the remaining 396 individuals, only 17 individuals were greater than 1% (1.0 to 9.3%) admixed across this cluster. At K = 3 and K = 4, no significant additional population structure was detected, with the additional genotypic clusters associated with two individuals from Boyup Brook and two individuals from Cocata (Supplementary Fig. S5).
Figure 3

Proportional assignment to genotypic clusters, K, based on STRUCTURE analysis of P. xylostella individuals from Australia in 2014 and 2015. Individuals are represented by vertical bars and genotypic clusters are represented by different colours. Individuals collected each year were analysed separately and in both years the data most likely formed two genotypic clusters. Top panel: Analysis at K = 2 for 434 individuals collected from 31 locations in 2014. Bottom panel: Analysis at K = 2 for 399 individuals collected from 28 locations in 2015. Within years, bar plots show a high degree of genotypic admixture across individuals regardless of geographic location, as shown by sharing of blue-coloured bars, with a second genotypic cluster represented by red-coloured bars shared predominantly by several individuals at a single location.

Proportional assignment to genotypic clusters, K, based on STRUCTURE analysis of P. xylostella individuals from Australia in 2014 and 2015. Individuals are represented by vertical bars and genotypic clusters are represented by different colours. Individuals collected each year were analysed separately and in both years the data most likely formed two genotypic clusters. Top panel: Analysis at K = 2 for 434 individuals collected from 31 locations in 2014. Bottom panel: Analysis at K = 2 for 399 individuals collected from 28 locations in 2015. Within years, bar plots show a high degree of genotypic admixture across individuals regardless of geographic location, as shown by sharing of blue-coloured bars, with a second genotypic cluster represented by red-coloured bars shared predominantly by several individuals at a single location. In 2015, the delta K method produced an optimal at K = 2 and weaker secondary modes at K = 3 and K = 5 while the highest log-likelihood occurred at K = 5 (Supplementary Fig. S4). The modes at K = 3 and K = 5 indicate sub-structure in the data. At K = 2, most individuals shared nearly uniform ancestry across the major genotypic cluster regardless of geographic location (Fig. 3). The second genotypic cluster was predominantly associated with individuals from Southend, where 10 individuals showed 31.7 to 99.4% cluster assignment. At higher K values, further geographic structure was identified. At K = 3, two clusters were mainly associated with Southend (cluster A, 7 individuals with 26.1–98.6% assignment; cluster B: 10 individuals with 33.2–99.5% assignment) (Supplementary Fig. S5). At K = 4, the additional cluster was mainly associated with individuals collected from Walkers Beach in both autumn and spring 2015, showing a consistent pattern at both time points. At K = 5, the additional cluster was mostly represented by three individuals from Werombi. To further examine hierarchical structure, we reanalysed the 2015 data after removing Southend. This resulted in a weak delta K optimal at K = 3, but showed the same clustering pattern as the full 2015 dataset at K = 5 and is not presented. Individual-based PCA analysis identified clustering patterns consistent with the STRUCTURE analysis. In both years, eigenvalues for the first principle component (PC) were not strongly different from eigenvalues for other PCs, indicating no clear axis of variance in the data, and individuals across different geographic populations clustered together to a high degree (Fig. 4). In both years, PCA identified the most divergent individuals consistent with those in the STRUCTURE analysis. In 2014, three individuals from Esperance and two individuals from Cocata clustered distinctly along separate PC axes. In 2015, two groups of individuals from Southend clustered distinctly along the two PCs axes, and three individuals from Werombi formed an identifiable cluster along the vertical PC axis.
Figure 4

Principal components analysis of P. xylostella individuals collected from Australia in 2014 and 2015. Individuals are represented by small circles colour-coded by geographic population. Two populations with the most divergent individuals in each year are labelled.

Principal components analysis of P. xylostella individuals collected from Australia in 2014 and 2015. Individuals are represented by small circles colour-coded by geographic population. Two populations with the most divergent individuals in each year are labelled.

Discussion

In successful invading species, colonizing populations often exhibit reduced genetic diversity compared to their population of origin[53]. Previous molecular studies found a lack of genetic structure[19] or inconsistent patterns of genetic structure[25,36] among Australian populations of P. xylostella, and low genetic diversity implied a bottleneck during colonization[19,31]. To elucidate the movement patterns of P. xylostella in Australian canola cropping systems, we performed a comprehensive study of genetic structure among P. xylostella populations from crop and non-crop brassicaceous host plants throughout southern Australia. The study design included extensive field sampling to reflect the dispersal ecology of the species, molecular species identification, minimisation of sex bias, and a powerful SNP marker set derived from RAD-seq. Genome-wide analysis revealed a distinct lack of genetic structure among Australian P. xylostella populations, irrespective of geographic location, host plant, or sampling year. This pattern was temporally stable at seven locations sampled in both 2014 and 2015. Our findings based on SNPs were highly consistent with those based on six microsatellites[19]. Our SNP-based estimates of global within Australia were 0.0027 in 2014 and 0.0056 in 2015, compared to 0.0051 from microsatellites[19]. In both studies, > 99% of genetic variance occurred within populations with negligible variance explained by different locations or host plants. There was no evidence for isolation by distance, implying that any two populations, whether separated by distances of only several km or > 3,000 km, may be equally differentiated. Australian P. xylostella forms a single homogeneous population across bi-allelic neutral SNP markers. Cluster analysis confirmed the overall lack of genetic divergence among populations. In both years, STRUCTURE analysis identified K = 2 as the most likely number of genotypic clusters. This is a common result among studies employing the delta K method[54], because K = 1 cannot be obtained and because K = 2 often represents the top level of hierarchical population structure. At K = 2, a small number of divergent samples were identified at single geographic locations, including Esperance in 2014 and Southend in 2015. In 2015, at K values  3, two genotypic clusters occurred predominantly within the Southend population. Do these admixture patterns reflect genetic isolation? STRUCTURE sorts groups into Hardy–Weinberg linkage populations under the assumption of independent loci[55,56]. Among all populations, Southend had the lowest gene diversity, the fewest private sites and the highest values in pairwise population comparisons. Notably, this population was collected from a small and isolated patch of sea rocket consisting only of several large plants. It is likely that cluster patterns reflect an artefact of sampling related individuals at Southend[57,58]. In hierarchical STRUCTURE analysis of 2014 and 2015 samples, additional genotype clusters at successively higher K values occurred in a small number of individuals from single locations as STRUCTURE simply grouped the next most related samples at each hierarchical level. These results highlight the need for caution when samples of related individuals are present, to avoid false inferences of population structure. There is no evidence to suggest divergent samples represent interspecific hybrids of P. xylostella and the sympatric species, P. australiana. Although these species can hybridize in laboratory crosses, whether hybridization occurs in the wild is unknown[35]. Whole genome analysis of 29 Plutella individuals found no evidence for widespread introgression between these two species[59]. In our study, all Esperance and Southend individuals exhibited levels of heterozygosity across > 550,000 variant and invariant genome-wide loci that were similar to other P. xylostella. We would expect substantially higher heterozygosity if individuals were interspecific hybrids or if DNA samples were contaminated. PCA analysis of P. xylostella from Esperance 2014, Southend 2015 and five other locations, together with five P. australiana populations from Perry et al.[35], showed clear species groupings and no evidence of introgressed individuals (Supplementary Fig. S6). Genetic variation within a species is shaped by historical and contemporary evolutionary processes[7,60]. Because genetic homogeneity among Australian P. xylostella could reflect common ancestry[18], present gene flow patterns are not clear from our data. Considering the vast size of the continent, it seems unlikely that P. xylostella forms a panmictic population in Australia in the sense that interbreeding is completely random. Saw et al.[31] reported a small degree of sub-population structure among P. xylostella from 14 Australian locations based on geographic variation in frequencies of the dominant mtDNA haplotype. Insecticide resistance profiles of P. xylostella can vary spatially when the intensity of insecticide use differs across locations and host plant types, indicating that gene flow is often insufficient to overwhelm the effects of local selection on frequencies of resistance alleles[6,29]. Mo et al.[23] found that P. xylostella moves over limited distances within actively growing Brassica vegetable crops where suitable host plants are continuously available. There is little propensity to emigrate during summer when crops are irrigated and the surrounding landscape is dry. Limited movement in these crops can lead to functional isolation of sub-populations, whereby resistance is selected by the insecticide spray regimes of individual farmers. Local selection, rather than spread of resistance alleles through gene flow, largely explained variation in levels of resistance to synthetic pyrethroids among Australian P. xylostella populations[6]. Genetic differentiation indices are likely to over-estimate rates of gene flow, to the extent that genetic uniformity among P. xylostella reflects a genetic bottleneck and range expansion during colonization of Australia[61]. Although outgroups from other continents were unavailable for comparison, our recent data support the view that P. xylostella within Australia displays reduced genomic diversity compared to populations elsewhere. Australian P. xylostella, including 47 individuals from our study, were previously shown to exhibit 1.5-fold lower levels of heterozygosity across the nuclear genome relative to the endemic congeneric species, P. australiana[35]. Similarly, a study of microsatellite variation reported lower nucleotide diversity within P. xylostella populations from Australia than populations from Kenya and Indonesia[19]. Plutella xylostella exhibits lower mtDNA haplotype diversity within Australia[26,31,34,35] than within parts of Asia, Africa, Europe and North America[10,26-28,31,34,62]. Analysis of mtDNA in 102 Australian P. xylostella individuals, including 44 individuals from our study, identified only five closely-related COI haplotypes (613 bp) and two dominant shared haplotypes[35]. Reduced diversity across the genome is strong evidence for a bottleneck[63]. By contrast, a recent global study of P. xylostella whole genomes reported unexpectedly high levels of SNP diversity within populations from the Oceania region, including Australia, New Zealand, Vanuatu and Samoa, compared to populations from putative historical source regions in the Americas, Europe, Africa and Asia[64]. Whether loci under selection may explain this pattern warrants further study. Plutella xylostella within Australia and New Zealand appears to have been founded by a small number of females derived from an ancestral lineage in southern Asia[26,31,34,35,64]. It is likely that genetic homogeneity across the Australian P. xylostella distribution is maintained by some level of ongoing gene flow. RAD-seq markers revealed weak but significant genetic differentiation among field and laboratory-reared Australian P. xylostella populations ( to )[50] but not wild populations (, this study), implying that intermixing prevents divergence. Neutral bi-allelic SNPs failed to reveal the scale, frequency and timing of gene flow. Even very few migrants per generation can eliminate genetic differentiation among populations[65,66], especially where genetic diversity is low. If a small founding P. xylostella population originally colonized Australia[31], its present distribution throughout Australia demonstrates past gene flow at a continental level. This is consistent with the wide distribution of two shared haplotypes and with P. xylostella being a migratory species[4,13]. Within Australia, there is ample indirect evidence of P. xylostella dispersal, including the seasonal widespread colonization of winter-grown canola crops[18,67] and detection of moth flights in light trapping and pheromone trapping studies[19,37,67-69]. In canola-growing areas remote from Brassica vegetable production, the annual canola cropping cycle of crop planting, senescence and harvest forces P. xylostella to disperse regularly between crops and wild brassicaceous host plants in the landscape. This tends to homogenise the insecticide resistance profiles of P. xylostella across Australian canola-growing regions, and among canola crops and Brassica weeds within each region[24]. Gene flow creates potential for the spread of resistance alleles within and among Australian Brassica cropping systems. For certain newer insecticide chemistries, elevated resistance levels occur in P. xylostella within intensively sprayed Brassica vegetable crops relative to insects from canola crops or weeds[6]. Emigration of resistant moths from these cropping areas could contribute to the risk of P. xylostella insecticide resistance in canola cropping systems. Conversely, seasonal flights of P. xylostella moths into Brassica vegetable crops during spring[32,68], perhaps originating from senescing canola crops or weeds, could dilute resistance levels if immigrants are more susceptible, and provide opportunities for rotation strategies to manage resistance. Insecticide-resistant P. xylostella genotypes can persist locally in Australian canola-growing areas where summer-active brassicas occur[67]. Evidence for large-scale gene flow among Australian P. xylostella suggests insecticide resistance alleles arising in one location could readily spread to other locations. These alleles may be selected to a high frequency in local areas if insecticides are used repeatedly[70]. Neutral genome-wide SNPs were uninformative in identifying the dispersal patterns of P. xylostella in Australia, confirming previous conclusions[19]. Whether larger marker sets derived from massively parallel sequencing can provide insights into seasonal migration of P. xylostella in other global regions, where the species displays higher genetic diversity, remains to be evaluated.

Methods

Immature life stages (larvae or pupae; rarely, eggs) of Plutella species were collected between March 2014 and December 2015 throughout agricultural areas of southern Australia (Fig. 1). The dryland cropping areas of Australia experience climatic conditions most likely to support year-round persistence of P. xylostella[67,71], and therefore represent the gene pool. Four brassicaceous P. xylostella host plant types were sampled: canola crops, Brassica vegetable and forage crops, and wild brassicaceous species (Fig. 1). The wild species were wild radish, Raphanus raphanistrum, turnip weed, Rapistrum rugosum, sea rocket, Cakile maritima, Ward’s weed, Carrichtera annua, and mixed stands of sand rocket, Diplotaxis tenuifolia and wall rocket, D. muralis (Table 1). In both years, most sampling was temporally restricted to periods in autumn (March to April) and spring (September to October) to minimise potential for migration to affect genetic structure[7]. Spring sampling corresponds to population peaks of P. xylostella in crops while autumn sampling corresponds to population troughs at the end of the summer/autumn non-cropping period, when host plants and the insect are often locally rare. Several locations were sampled in both years to allow temporal comparisons. At each location,  25 individuals were collected from multiple plants across a representative area to minimise sampling of related individuals, either using a sweep net in canola crops, Brassica forage crops and wild host plants, by hand in Brassica vegetable crops, or by beating plants over a collection tray for sea rocket. To eliminate parasitised individuals, each population was reared separately in a ventilated plastic container on leaves of the original host plant for 1–2 days and thereafter on cabbage leaves. Non-parasitised pupae or late-instar larvae were fresh frozen at − 80 °C.

DNA isolation and species identification

For each population, 16 individuals were sequenced where possible after removing parasitised individuals. To avoid biases due to sex-linked markers[72], we visually determined the sex of individual pupae (but not larvae) by examining external genital morphology[73] under a dissecting microscope, then male and female individuals were selected to achieve a balanced sex ratio within each population where possible. Genomic DNA was isolated by homogenising whole individuals using a TissueLyser II (Qiagen) followed by two phenol and one chloroform extractions according to Zraket et al.[74]. DNA was treated with RNase A, then precipitated and resuspended in TE buffer. To distinguish P. xylostella from P. australiana, species identification was performed using a PCR-RFLP assay[35] and P. xylostella individuals were retained for analysis.

RAD-seq library preparation and sequencing

Libraries were prepared for restriction-site-associated DNA sequencing (RAD-seq) according to a protocol modified from Baird et al.[41] as described in Perry et al.[35]. Genomic DNA was quantified using a Qubit 2.0 fluorometer (Invitrogen) and 200ng digested with 10 units of high fidelity Sbf1 in Cutsmart buffer (NEB) for 1 h at 37 °C, then heat inactivated at 80 °C for 20 min. One microlitre of P1 adapter (100nM) with a 6-base molecular identifier (MID) (top strand 5-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGxxxxxxTGCA-3, bottom strand 5-[P]xxxxxxCTGTCTCTTATACACATCTGACGCTGCCGACGA-3, x represents sites for MIDs) were then added using 0.5L T4 DNA ligase (Promega), 1nM ATP and Cutsmart buffer. Sixteen individuals with unique P1 adapters were pooled per library. To minimise sequencing biases or batch effects, individuals from each population were randomised across 2–4 (usually 4) libraries and each library was sequenced across 2–4 sequencing lanes. Library pools were sheared using a Bioruptor sonicator (Diagenode), ends repaired using a Quick Blunting Kit (NEB), adenine overhangs added then P2 adapters (top strand 5-[P]CTGTCTCTTATACACATCTCCAGAATAG-3, bottom strand 5-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGT-3) ligated, then (majority of libraries) size-selected (300–700 bp) on agarose gel to remove primer dimer. DNA purification between steps was performed using a MinElute PCR purification kit (Qiagen). Library amplification was performed using KAPA HiFi Hotstart Readymix (Kapa Biosystems) and Nextera i7 and i5 indexed primers with PCR conditions as described in Perry et al.[35]: 95 °C for 3 min, two cycles of 98 °C for 20 s, 54 °C for 15 s, 72 °C for 1 min, then 15 cycles of 98 °C for 20 s, 65 °C for 15 s, 72 °C for 1 min followed by a final extension of 72 °C for 5 min. Libraries were size-selected (300–700 bp) on agarose gel and purified using a minElute Gel Extraction Kit (Qiagen). Illumina paired-end sequencing was performed across seven lanes using HiSeq2500 (100 bp) or NextSeq500 (75 bp) at the Australian Genome Research Facility (AGRF). Additionally, 16 individuals from a separate sequencing run as described in Perry et al.[50] were included in downstream analysis. Sequence read quality was examined using FastQC[75]. As Nextseq reads had low quality base calls within restriction sites (a common problem when using fixed-length MIDs on this platform, which cause low sequence diversity and cluster signal in this region), we opted to remove restriction sites from all reads for downstream analysis. Sequence reads were de-multiplexed using RADtools version 1.2.4[42] allowing one base MID mismatch, then TRIMMOMATIC v0.32[76] was used to remove restriction sites, adapter sequences, a thymine base from reverse reads introduced by the P2 adapter, and quality filter using the ILLUMINACLIP tool with parameters: TRAILING:10 SLIDINGWINDOW:4:15 MINLEN:40. Paired reads were aligned to the P. xylostella reference genome (accession number: GCF_000330985.1) using STAMPY version 1.0.21[77] with –baq and –gatkcigarworkaround options and expected substitution rate set to 0.03 to reflect our expectations of sequence divergence from the reference strain. Duplicate reads were removed and individual sample BAM files merged using PICARD version 1.71[78]. Genotypes were jointly called for all individuals using the Genome Analysis Tool Kit version 3.3-0[79,80] HaplotypeCaller tool. We determined that base quality score recalibration using bootstrapped SNP databases was inappropriate for this dataset as it globally reduced quality scores. The variant call set was hard-filtered using VCFtools version 0.1.12a[81]. After iteratively testing multiple filtering parameter sets, we removed indels and retained confidently called bi-allelic SNPs (GQ  30) genotyped in at least 80% of individuals with a minimum genotype depth of 5, minQ  400, average site depth of 12–100, minimum minor allele frequency of 0.01 and in Hardy–Weinberg equilibrium at an alpha level of 0.05. To avoid closely-linked sites, we retained only SNPs separated by a minimum of 2,000 bp using the VCFtools—thin function. In order to estimate population-level genetic diversity, from the output of GATK HaplotypeCaller we generated a set of all confidently-called (GQ  30) variant and invariant sites and hard filtered to remove sites within repetitive regions and retain sites genotyped in at least 80% of individuals with an average site depth of 12–100. The filtered VCFs were converted to other file formats for downstream analysis using PGDSpider version 2.1.1.2[82] and custom R scripts[83].

Genetic diversity

The R package hierfstat[84] was used to calculate within-population gene diversity (, observed heterozygosity () and the inbreeding coefficient () according to Nei[85]. Population means for site depth and number of SNPs, indels and private sites were calculated using the –depth function and vcfstats module in VCFtools version 0.1.12a[81]. To examine population differentiation, a global estimate of [86] with bootstrapped 99% confidence intervals ( bootstrap iterations) was calculated in R package diveRsity[87]. Pairwise values for all population pairs were calculated and significance of differentiation determined using exact G tests ( MCMC burnins, batches, iterations per batch) in GENEPOP v4.6[88] after correction for multiple comparisons using the Bonferroni–Holm correction method[89,90]. Isolation by distance among populations[52] was investigated separately for 2014 and 2015 datasets. We used R[91] to construct heat maps and visually inspected the congruence between pairwise matrices of untransformed geographic distances in kilometres and genetic distances, , for corresponding population pairs. Significance of the regressions of pairwise linearized genetic distances[92] onto log-transformed geographic distances was determined using a Mantel test with permutations in R package ade4 version 1.7-6[93]. Geographic distances were calculated using R package geosphere version 1.5-7[94]. Analysis of molecular variance (AMOVA) was performed using the pegas implementation in R package poppr version 2.7.1[95]. The data were analysed under two hierarchical model structures. In model A, all individuals were analysed together and populations were grouped into sampling years and Brassica host types. In Model B, a temporal analysis was performed for locations sampled in both 2014 and 2015, to investigate whether variance was greater among years within locations or vice versa.

Population structure

Two individual-based clustering approaches were used to investigate population structure. First, Bayesian clustering was implemented in the program STRUCTURE version 2.3.4[55]. Variant data were converted from VCF to STRUCTURE file format using PDGSpider version 2.1.1.2[82]. For all runs, we used a burnin length of followed by a run length of 10 MCMC iterations and performed fifteen independent runs for each K value, where K is the number of genotypic clusters, using a different random seed for each run, assuming the locprior model with correlated allele frequencies and set to 1. As preliminary runs showed that most structure was identified at low K values, we analysed K-values from 1 to 10 in both years. The optimal value of K was estimated using the delta K method[96] implemented in STRUCTURE HARVESTER[97] and inspection of the likelihood distribution for each model. Q-matrices were aligned using CLUMPP version 1.1.2[98] and visualised using DISTRUCT version 1.1[99]. To further explore clustering, we performed individual-based principal components analysis (PCA) separately for 2014 and 2015 datasets using R package adegenet version 2.0.1[100,101], using scaled and centred allele frequencies and imputing missing data by taking the mean of population allele frequencies. The statistical power of the SNP marker set to detect population structure was assessed using POWSIM version 4.1[102]. This program allows the user to test the likelihood of loci of detecting genetic differentiation for pre-defined values of . For the dataset, 1,000 simulations were performed over a range of values from 0.001 to 0.01 assuming an effective population size of 5,000. The number of subpopulations, sample sizes and allele frequencies from our data were used and the generations of drift varied to achieve the target . As POWSIM currently handles a maximum of 30 populations, for the 2014 dataset the number of subpopulations was set to this value. The null hypothesis of genetic homogeneity was tested using Fisher’s exact test and a Chi-square test.

Accession codes

RAD sequences are available from the Sequence Read Archive under accession PRJNA471964. Supplementary Information.
  58 in total

1.  Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study.

Authors:  G Evanno; S Regnaut; J Goudet
Journal:  Mol Ecol       Date:  2005-07       Impact factor: 6.185

2.  Isolation by Distance.

Authors:  S Wright
Journal:  Genetics       Date:  1943-03       Impact factor: 4.562

3.  Microsatellites reveal a lack of structure in Australian populations of the diamondback moth, Plutella xylostella (L.).

Authors:  N M Endersby; S W McKechnie; P M Ridland; A R Weeks
Journal:  Mol Ecol       Date:  2006-01       Impact factor: 6.185

4.  adegenet 1.3-1: new tools for the analysis of genome-wide SNP data.

Authors:  Thibaut Jombart; Ismaïl Ahmed
Journal:  Bioinformatics       Date:  2011-09-16       Impact factor: 6.937

5.  Sex matters in massive parallel sequencing: Evidence for biases in genetic parameter estimation and investigation of sex determination systems.

Authors:  Laura Benestan; Jean-Sébastien Moore; Ben J G Sutherland; Jérémy Le Luyer; Halim Maaroufi; Clément Rougeux; Eric Normandeau; Nathan Rycroft; Jelle Atema; Les N Harris; Ross F Tallman; Spencer J Greenwood; Fraser K Clark; Louis Bernatchez
Journal:  Mol Ecol       Date:  2017-07-24       Impact factor: 6.185

Review 6.  Biology, Ecology, and Management of the Diamondback Moth in China.

Authors:  Zhenyu Li; Xia Feng; Shu-Sheng Liu; Minsheng You; Michael J Furlong
Journal:  Annu Rev Entomol       Date:  2015-12-11       Impact factor: 19.686

7.  Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction.

Authors:  Zhian N Kamvar; Javier F Tabima; Niklaus J Grünwald
Journal:  PeerJ       Date:  2014-03-04       Impact factor: 2.984

8.  RAD sequencing resolves fine-scale population structure in a benthic invertebrate: implications for understanding phenotypic plasticity.

Authors:  David L J Vendrami; Luca Telesca; Hannah Weigand; Martina Weiss; Katie Fawcett; Katrin Lehman; M S Clark; Florian Leese; Carrie McMinn; Heather Moore; Joseph I Hoffman
Journal:  R Soc Open Sci       Date:  2017-02-08       Impact factor: 2.963

9.  Gene flow in the green mirid, Creontiades dilutus (Hemiptera: Miridae), across arid and agricultural environments with different host plant species.

Authors:  J P Hereward; G H Walter; P J Debarro; A J Lowe; C Riginos
Journal:  Ecol Evol       Date:  2013-02-25       Impact factor: 2.912

10.  Effects of a sex-ratio distorting endosymbiont on mtDNA variation in a global insect pest.

Authors:  Ana M Delgado; James M Cook
Journal:  BMC Evol Biol       Date:  2009-03-03       Impact factor: 3.260

View more
  2 in total

1.  Whole genome comparisons reveal panmixia among fall armyworm (Spodoptera frugiperda) from diverse locations.

Authors:  Katrina A Schlum; Kurt Lamour; Caroline Placidi de Bortoli; Rahul Banerjee; Robert Meagher; Eliseu Pereira; Maria Gabriela Murua; Gregory A Sword; Ashley E Tessnow; Diego Viteri Dillon; Angela M Linares Ramirez; Komivi S Akutse; Rebecca Schmidt-Jeffris; Fangneng Huang; Dominic Reisig; Scott J Emrich; Juan Luis Jurat-Fuentes
Journal:  BMC Genomics       Date:  2021-03-12       Impact factor: 3.969

2.  Characterization of a new strain of Metarhizium novozealandicum with potential to be developed as a biopesticide.

Authors:  Laura F Villamizar; Gloria Barrera; Mark Hurst; Travis R Glare
Journal:  Mycology       Date:  2021-06-18
  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.