Reliability and Utility of Standard Gene Sequence Barcodes for the Identification and Differentiation of Cyst Nematodes of the Genus Heterodera.

Daniel C Huston1, Manda Khudhir1, Mike Hodda1.   

Abstract

Difficulties inherent in the morphological identification of cyst nematodes of the genus Heterodera Schmidt, 1871, an important lineage of plant parasites, have led to broad adoption of molecular methods for diagnosing and differentiating species. The pool of publicly available sequence data has grown significantly over the past few decades, and over half of all known species of Heterodera have been characterized using one or more molecular markers commonly employed in DNA barcoding (18S, internal transcribed spacer [ITS], 28S, coxI). But how reliable are these data and how useful are these four markers for differentiating species? We downloaded all 18S, ITS, 28S, and coxI gene sequences available on the National Center for Biotechnology Information (NCBI) database, GenBank, for all species of Heterodera for which data were available. Using a combination of sequence comparison and tree-based phylogenetic methods, we evaluated this dataset for erroneous or otherwise problematic sequences and examined the utility of each molecular marker for the delineation of species. Although we find the rate of obviously erroneous sequences to be low, all four molecular markers failed to differentiate between at least one species pair. Our results suggest that while a combination of multiple markers is best for species identification, the coxI marker shows the most utility for species differentiation and should be favored over 18S, ITS, and 28S, where resources are limited. Presently, fewer than half of the valid species of Heterodera have a sequence of coxI available, and only a third have more than one sequence of this marker.
© 2022 Huston et al. published by Sciendo.

Keywords:  detection; diagnosis; genetics; molecular biology; systematics; taxonomy

Year:  2022        PMID: 35975224      PMCID: PMC9338711          DOI: 10.2478/jofnem-2022-0024

Source DB:  PubMed          Journal:  J Nematol        ISSN: 0022-300X            Impact factor:   1.481


Diagnoses of important plant pests are increasingly reliant on molecular gene sequence data. This is especially the case for plant-parasitic nematodes, a group for which traditional taxonomic expertise is in decline (Coomans, 2002; Eyualem-Abebe et al., 2006; Oliveira et al., 2011). Collectively, plant-parasitic nematodes are estimated to cause up to 15% loss of the total global crop production, valued at over 100 billion USD annually (Koenning et al., 1999; Abad et al., 2008; Nicol et al., 2011; Singh et al., 2013, 2015; Phani et al., 2021). The great diversity and species richness of plant-parasitic nematodes and their broad range of host associations and life cycle strategies (e.g., Procter, 1984; Boag and Yeates, 1998; Siddiqi, 2000; Oliveira et al., 2011; Palomares-Rius et al., 2014; Salas et al., 2022) mean that knowledge of species-level biology and life history is often needed for effective control strategies. Therefore, the ability to distinguish between closely related species is of critical importance. The tylenchid family Heteroderidae Filipjev & Schuurmans Stekhoven, 1941, includes seven genera of “cyst nematodes,” a monophyletic lineage of sedentary plant parasites united primarily by the form taken by adult females at the end of their life cycle, a hardened sac containing embryonated eggs (Luc et al., 1986; Baldwin, 1992; Subbotin et al., 2001, 2010a, 2010b; Bert et al., 2008). The largest genus, Heterodera Schmidt, 1871, comprises about 85 species, many of which are devastating pests of important crops, including cereals, legumes, vegetables, and a wide variety of other crops (Nicol et al., 2007; Subbotin et al., 2010a, 2010b; Toumi et al., 2013; Smiley et al., 2017). 
The taxonomy of Heterodera is complex and morphological identification can be difficult as adults are sexually dimorphic; differences between many species are subtle and multiple life cycle stages are often required for accurate species identification and delineation (Subbotin et al., 2010b). Furthermore, a few species cannot be distinguished from one another morphologically at any life cycle stage (e.g., Subbotin et al., 2002). Thus, molecular sequence data have become an important part of diagnoses in this group. Molecular identification of species of Heterodera initially relied on polymerase chain reaction (PCR) restriction fragment length polymorphism profiles (PCR-RFLP) (Waeyenberge et al., 2009). This methodology has largely been superseded by DNA sequencing, with species being characterized primarily with four markers commonly employed in DNA barcoding: the small subunit ribosomal RNA (18S rDNA), internal transcribed spacer (ITS; comprising ITS1-5.8S-ITS2), large subunit ribosomal RNA (28S rDNA), and the mitochondrial cytochrome oxidase subunit one (coxI) gene regions (e.g., Szalanski et al., 1997; Subbotin et al., 2000, 2010a, 2010b, 2018; Ferris et al., 2004; Mundo-Ocampo et al., 2008; Escobar-Avila et al., 2018; Powers et al., 2019). The number of sequences for species of Heterodera uploaded to the public database GenBank has accumulated rapidly since the late 1990s to early 2000s when these data first began to become available (e.g., Ferris et al., 1993; Szalanski et al., 1997; Subbotin et al., 2000, 2001). There have been some reports, however, of species pairs which cannot be differentiated using sequences of one or more of these genes (Subbotin et al., 2000, 2001, 2018; Waeyenberge et al., 2009; Vovlas et al., 2015; Sekimoto et al., 2017; Escobar-Avila et al., 2018). 
As we move into the genomic era, with all its implications for more rapid and better identification methodologies, it seems pertinent to assess the accuracy and utility of the pool of barcoding gene sequences that have accumulated over the last few decades as these data will undoubtedly be incorporated into the next generation of molecular diagnostic tools. Here, we evaluate these barcoding data with aims to identify the availability of sequences for species of the genus as a whole and across regions, the reliability of these data in terms of erroneous or otherwise unreliable sequences, and the utility of the 18S, ITS, 28S, and coxI markers for accurate species identification and delineation.

Materials and Methods

Analyses here are based on sequence data obtained from the NCBI public database, GenBank, up to 10 August 2021. We downloaded all available partial or complete sequences of the 18S rDNA, ITS, 28S rDNA, and coxI gene regions for all named species of Heterodera. We note that ITS is situated between the 18S and 28S genes and comprises the ITS1, 5.8S, and ITS2 genes; we considered a sequence as “ITS” if it included a partial fragment or complete sequence of one of the latter three genes. We searched for all valid species individually and also searched using junior synonyms. We created a database of these sequences where each line of data records the GenBank accession number, species, gene region, geographic collection information, and sequence author(s) or publication reference. In those cases where some of this information was not included with the sequence record on GenBank, we sought it from the publication listed and/or through web searches of scholarly literature aiming to identify if the GenBank accession number in question had been published or referenced. Where collection locality could not be determined and where possible, collection locality was inferred based on the institutional addresses of the author(s) listed in the respective sequence record. Base statistics were calculated in Microsoft Excel, and a map depicting the geographic spread of available sequences was generated using ggplot2 (Wickham, 2009) in R (https://www.R-project.org) and edited in Adobe Illustrator CS6.

Assessment of sequence reliability

We evaluated whether publicly available sequence data of the four markers selected were reliably assigned to the correct species of Heterodera by determining the number of sequences labeled as a particular species of Heterodera that were clearly not, or could not reliably be determined to be, that species. To do this, we constructed individual sequence files for each species/gene marker combination represented in our database and added a sequence of Globodera pallida (Stone, 1973) Behrens, 1975 and Globodera rostochiensis (Wollenweber, 1923) Skarbilovich, 1959 to each file to serve as outgroup taxa. Each sequence file was aligned using MUSCLE (Edgar, 2004) as implemented in MEGA X (Kumar et al., 2018), except where there were fewer than five sequences of a particular marker for a species. Alignments were not trimmed or restricted to specific regions of the gene under analysis. Neighbor-joining trees based on each alignment file were constructed in MEGA X with the following parameters: 100 bootstrap replications, the number of differences model, inclusion of transitions and transversions, uniform rates among sites, and pairwise deletion of gaps and missing data. Trees were examined by eye for clear outliers and potentially problematic sequences. Problematic sequences were defined as those that diverged significantly from putative congeners in the neighbor-joining trees generated. Such sequences were added to a new database and evaluated through reexamination of alignments and comparison against GenBank using BLAST (Altschul et al., 1990) to determine the source of observed divergence. Where there were fewer than five sequences of a particular marker for a species, sequences were compared directly against the GenBank database using BLAST. Divergence from other sequences attributed to the same and other species was recorded.
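The "number of differences" distance with pairwise deletion described above can be illustrated with a minimal Python sketch. It is not the MEGA X implementation; the function name, the choice to treat `-`, `?`, and `N` as gap or missing data, and the toy sequences are assumptions made for illustration only. Other ambiguity codes are simply compared as ordinary characters here.

```python
# Characters treated as gap/missing data (an assumption for this sketch).
GAP_OR_MISSING = set("-?N")


def count_differences(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Return (differences, compared_sites) for two aligned sequences.

    Sites where either sequence has a gap or missing character are skipped
    for this pair only, which is the idea behind pairwise deletion.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    differences = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_OR_MISSING or b in GAP_OR_MISSING:
            continue  # pairwise deletion: ignore this site for this pair
        compared += 1
        if a != b:
            differences += 1  # transitions and transversions both count
    return differences, compared


# Toy aligned sequences (hypothetical, not from the dataset).
print(count_differences("ACGT-A", "ACCTTA"))
```

Returning the number of compared sites alongside the difference count makes it straightforward to derive a percent difference for the same pair.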

Assessment of sequence utility

The second aim of our study was to assess the power of each of the four molecular markers for accurate delineation of species. For this, we first generated four new sequence files, each including all sequences of Heterodera for each respective marker as above. Problematic sequences identified in our initial assessment of reliability were excluded, and sequences of G. pallida and G. rostochiensis were added as outgroup taxa as above. Alignments for each dataset were constructed using MAFFT via the online service (Katoh et al., 2019) and examined by eye for additional problematic sequences which would impede tree-based phylogenetic methods; such sequences were added to the database of problematic sequences and removed from their respective sequence files. Sequences were then re-aligned. Neighbor-joining trees were constructed for each dataset in MEGA X as above and examined by eye for clades which included sequences of two or more putatively different species that were poorly or not differentiated from one another. New data files were created for the sequences from each of these ambiguous species pairs or groups with sequences of Globodera spp. added as outgroup taxa. These ambiguous sequence datasets were aligned using MUSCLE. Neighbor-joining trees were constructed as above, and maximum likelihood trees were also computed in MEGA X using 100 bootstrap replications and the general time reversible model with uniform rates among sites. Intraspecific and interspecific variation were further examined using pairwise comparison tables generated in MEGA X.
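A pairwise comparison of this kind can be sketched in a few lines of Python. The sketch below is illustrative only and does not reproduce the MEGA X tables; the function names, the record layout (species label plus aligned sequence), and the toy data are all assumptions. It reports the minimum number of differing sites observed between each pair of species labels.

```python
from itertools import combinations


def n_diff(a: str, b: str) -> int:
    """Count differing sites, skipping gaps and N (pairwise deletion)."""
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x not in "-N" and y not in "-N"
    )


def min_interspecific(records):
    """records: list of (species_label, aligned_sequence) tuples.

    Returns a dict mapping each sorted species pair to the smallest
    difference observed between any two of their sequences.
    """
    result = {}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        if sp1 == sp2:
            continue  # intraspecific comparison, not needed here
        pair = tuple(sorted((sp1, sp2)))
        d = n_diff(s1, s2)
        result[pair] = min(d, result.get(pair, d))
    return result


# Hypothetical labels and sequences, for illustration only.
toy = [
    ("sp_a", "ACGTAC"),
    ("sp_a", "ACGTAT"),
    ("sp_b", "ACGTAC"),
    ("sp_c", "TCGAAC"),
]
for pair, d in sorted(min_interspecific(toy).items()):
    print(pair, d)
```

A minimum of zero for a pair, as for `sp_a` and `sp_b` in the toy data, corresponds to two putatively distinct species sharing an identical sequence.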

Data availability

Our sequence databases and associated data, along with sequence FASTAs, alignments, trees, and other files are all made publicly available on the Commonwealth Scientific and Industrial Research Organisation (CSIRO) Data Access Portal.

Results

General results

Our sequence database includes 2,737 entries, comprising 77, 1,723, 345, and 592 sequences of the 18S, ITS, 28S, and coxI gene regions, respectively. Of the 2,737 sequences in our database, only 1,669 (61%) could be associated with a paper published in academic or other technical literature. Altogether, sequences were available for 66% of valid species of Heterodera (57 out of 87, based on our count at the time of writing; see Li et al., 2020; Hodda, 2022). Eight species were represented by just a single sequence, and for seven of these, this was of the ITS gene region. Only 13 species had at least one sequence of each marker, and only 11 species had more than a single sequence of each marker (Table 1; Supplementary Table 2).
Table 1

Numbers of species of Heterodera for which one or more gene sequences of one or more molecular markers (18S, ITS, 28S, and coxI) are available on GenBank.

| Species of Heterodera | 18S | ITS | 28S | coxI | All markers |
|---|---|---|---|---|---|
| Species with at least one sequence | 18 | 55 | 36 | 38 | 13 |
| Species with more than one sequence | 13 | 38 | 22 | 29 | 11 |

Table 2

Species pairs of Heterodera which are inadequately or poorly distinguished from one another using one or more of the standard molecular markers evaluated (18S, ITS, 28S, coxI), including minimum base pair differences observed between sequences of problematic species pairs.

| Species group | Species pair | Gene region | Minimum base pair difference | Delineation power | Notes |
|---|---|---|---|---|---|
| Avenae | H. avenae–H. filipjevi | 18S | 1 | Inadequate | |
| Avenae | H. avenae–H. hordecalis | 18S | 4 | Weak | H. hordecalis not monophyletic in NJ/ML analyses; potentially distinguishable from closely related species in isolated analyses. |
| Avenae | H. avenae–H. mani | 18S | 1 | Inadequate | |
| Avenae | H. filipjevi–H. hordecalis | 18S | 1 | Inadequate | |
| Avenae | H. filipjevi–H. mani | 18S | 2 | Inadequate | |
| Avenae | H. hordecalis–H. mani | 18S | 3 | Weak | See above note. |
| Avenae | H. arenaria–H. avenae | ITS | 0 | Inadequate | |
| Avenae | H. avenae–H. pratensis | ITS | 0 | Inadequate | |
| Avenae | H. avenae–H. australis | ITS | 2 | Weak | Although only two bp different, sequences of H. australis form clade with some sequences of “H. avenae” which have been shown to be H. australis (see Subbotin et al., 2018). |
| Avenae | H. avenae–H. mani | ITS | 2 | Weak | Despite only two bp difference, sequences of H. mani form monophyletic clade to exclusion of other closely related sequences. |
| Avenae | H. aucklandica–H. avenae | 28S | 1 | Inadequate | Only one 28S sequence of H. aucklandica available for comparison. |
| Avenae | H. aucklandica–H. hordecalis | 28S | 1 | Inadequate | Two 28S sequences of H. hordecalis 9 bp different |
| Avenae | H. aucklandica–H. pratensis | 28S | 1 | Inadequate | Only one 28S sequence of H. pratensis available for comparison. |
| Avenae | H. avenae–H. pratensis | 28S | 0 | Inadequate | See above note. |
| Avenae | H. avenae–H. hordecalis | 28S | 0 | Inadequate | Two 28S sequences of H. hordecalis 9 bp different |
| Avenae | H. arenaria–H. avenae | coxI | 2 | Weak | Despite only two bp difference, H. arenaria forms monophyletic clade within H. avenae group. |
| Cyperi | H. elachista–H. oryzae | ITS | 1 | Inadequate | |
| Cyperi | H. elachista–H. oryzae | 28S | 0 | Inadequate | |
| Goettingiana | H. carotae–H. cruciferae | ITS | 1 | Inadequate | |
| Goettingiana | H. goettingiana–H. microulae | 28S | 0 | Inadequate | Two 28S sequences of H. goettingiana are 19 bp different from one another; one potentially misidentified |
| Goettingiana | H. carotae–H. urticae | 28S | 2 | Weak | Only one 28S sequence of H. urticae available for comparison. |
| Goettingiana | H. carotae–H. cruciferae | 28S | 2 | Weak | Only one 28S sequence of H. cruciferae available for comparison (two on GenBank but one appears to actually be H. schachtii). |
| Goettingiana | H. carotae–H. cruciferae | coxI | 1 | Inadequate | Three coxI sequences available for H. cruciferae, one 15 bp different from others. |
| Schachtii | H. betae–H. glycines | 18S | 2 | Weak | H. glycines forms monophyletic clade to exclusion of H. betae, H. schachtii & H. trifolii in ML, but not NJ, analyses. |
| Schachtii | H. betae–H. schachtii | 18S | 0 | Inadequate | |
| Schachtii | H. betae–H. trifolii | 18S | 0 | Inadequate | |
| Schachtii | H. schachtii–H. glycines | 18S | 2 | Weak | See above note. |
| Schachtii | H. schachtii–H. trifolii | 18S | 0 | Inadequate | |
| Schachtii | H. trifolii–H. glycines | 18S | 2 | Weak | See above note. |
| Schachtii | H. betae–H. daverti | ITS | 0 | Inadequate | |
| Schachtii | H. betae–H. schachtii | ITS | 0 | Inadequate | |
| Schachtii | H. betae–H. trifolii | ITS | 0 | Inadequate | |
| Schachtii | H. ciceri–H. schachtii | ITS | 2 | Inadequate | |
| Schachtii | H. ciceri–H. trifolii | ITS | 1 | Inadequate | |
| Schachtii | H. daverti–H. schachtii | ITS | 0 | Inadequate | |
| Schachtii | H. daverti–H. trifolii | ITS | 0 | Inadequate | |
| Schachtii | H. glycines–H. medicaginis | ITS | 1 | Inadequate | |
| Schachtii | H. schachtii–H. trifolii | ITS | 0 | Inadequate | |
| Schachtii | H. betae–H. daverti | 28S | 1 | Inadequate | |
| Schachtii | H. betae–H. schachtii | 28S | 2 | Inadequate | |
| Schachtii | H. betae–H. trifolii | 28S | 1 | Inadequate | |
| Schachtii | H. daverti–H. schachtii | 28S | 3 | Weak | |
| Schachtii | H. daverti–H. trifolii | 28S | 1 | Inadequate | |
| Schachtii | H. schachtii–H. trifolii | 28S | 1 | Inadequate | |

Sequence data were attributed to specimens collected from 59 countries, from all continents except Antarctica (Fig. 1). More than half came from three countries: China (937), Turkey (272), and the USA (254), while 19 countries had fewer than five sequences attributed to them, including Afghanistan, Chile, Côte d’Ivoire (Ivory Coast), and Qatar, each with just one record (Supplementary Table 1). Notably, very few sequences originated from sub-Saharan Africa and only a single sequence originated from South America. Iran had sequences attributed to the most species (19), followed by the USA (18), China (16), and Germany (15) (Supplementary Table 2).
Figure 1

Geographic origin of 18S, ITS, 28S, and coxI gene sequences analyzed in the present study. The size of the circle reflects the total number of sequences of all markers centered on the country of origin.


Assessment of reliability

The overall rate of erroneous sequences was low, with only 55 of the total 2,737 sequences (2%) being detected as potentially problematic. Of these 55 sequences, 25 were accurate but had been uploaded running in the negative strand (3′ to 5′) and required reverse complementing before they could be aligned with congeners. The remaining 30 sequences were considered truly erroneous or otherwise unreliable. Issues appeared to include poor-quality sequencing results and/or sequence editing errors, sequences of Heterodera uploaded under the wrong species name (e.g., Fig. 2), misidentification of related nematodes as Heterodera, and clear contamination (including a sequence of a fungus and cucumber).
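Reverse complementing a sequence submitted on the opposite strand is a simple operation, sketched below in Python. This is an illustrative sketch rather than the procedure used in the study: the shared k-mer heuristic for choosing an orientation, the function names, the default `k = 8`, and the toy reference are all assumptions.

```python
# Translation table for unambiguous DNA bases (ambiguity codes are left as-is).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def shared_kmers(reference: str, query: str, k: int = 8) -> int:
    """Count the k-mers of `query` that also occur in `reference`."""
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return sum(
        1 for i in range(len(query) - k + 1) if query[i:i + k] in ref_kmers
    )


def orient_to_reference(query: str, reference: str, k: int = 8) -> str:
    """Return `query` in whichever orientation shares more k-mers with `reference`."""
    flipped = reverse_complement(query)
    if shared_kmers(reference, flipped, k) > shared_kmers(reference, query, k):
        return flipped
    return query


# Hypothetical reference and a fragment stored on the opposite strand.
reference = "ATGGCGTACGTTAGCCTAGGTTCAGATTACA"
stored = reverse_complement(reference[3:25])
print(orient_to_reference(stored, reference) == reference[3:25])
```

In practice an alignment tool or BLAST hit strand would normally indicate the orientation; the k-mer count here only stands in for that signal.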
Figure 2

Neighbor-joining tree based on coxI sequences of Heterodera avenae from the NCBI database GenBank showing a clade of mislabeled sequences.

Each of the four molecular markers evaluated failed to differentiate at least one species pair (Table 2). In many cases, putatively distinct species shared identical sequences in one or more of the markers evaluated (Table 2; Fig. 3). Many other comparisons included differences of only one or two base positions (bp), and these slight differences were generally inadequate for species delineation in tree-based methods and BLAST. Intraspecific genetic variation observed for some species of Heterodera (Table 3) spans the interspecific variation present within some species groups (e.g., Table 2). There were, however, some slight differences in sequences that were consistent enough to delineate species reliably. For example, despite having ITS sequences differing by only two bp from those of H. avenae Wollenweber, 1924, H. mani Mathews, 1971, consistently formed a monophyletic clade to the exclusion of other related species in tree-based methods.
Figure 3

Neighbor-joining (A) and maximum likelihood (B) trees derived from analyses of ITS gene sequences for the Schachtii species group showing multiple instances where species cannot be differentiated.

Table 3

Intraspecific differences observed in the 18S, ITS, 28S, and coxI gene regions for those species of Heterodera for which more than one sequence of one or more of the respective gene regions are available, excluding problematic sequences.

Each entry is given as "n: bp (%)". Entries follow the marker column order 18S, ITS, 28S, coxI; empty marker columns are not shown, so rows with four entries cover all four markers in that order.

Heterodera arenaria | n = 4: 0–1; 0.50 (0–0.29; 0.15) | n = 6: 0–2; 2 (0–0.48; 0.48)
Heterodera aucklandica | n = 3: 0 (0) | n = 5: 0–1; 0 (0–0.24; 0)
Heterodera australis | n = 18: 0–9; 3 (0–0.93; 0.31) | n = 5: 0 (0)
Heterodera avenae | n = 17: 0–11; 1 (0–1.09; 0.06) | n = 542: 0–80; 8 (0–11.89; 0.89) | n = 76: 0–5; 0 (0–0.67; 0) | n = 91: 0–34; 9 (0–8.17; 2.40)
Heterodera betae | n = 2: 1 (0.06) | n = 7: 0–12; 3 (0–1.29; 0.37) | n = 2: 0 (0) | n = 2: 0 (0)
Heterodera bifenestra | n = 3: 2–10; 8 (0.20–1.02; 0.81)
Heterodera cajani | n = 12: 0–12; 4 (0–1.18; 0.39) | n = 13: 0–11; 2 (0–1.46; 0.26)
Heterodera carotae | n = 13: 0–13; 6 (0–1.35; 0.62) | n = 8: 0–6; 3 (0–0.83; 0.41) | n = 17: 0–20; 3 (0–4.77; 0.82)
Heterodera ciceri | n = 3: 7–14; 11 (0.74–1.49; 1.18)
Heterodera cruciferae | n = 11: 0–10; 3 (0–1.21; 0.36) | n = 2: 52 (9.49) | n = 3: 1–16; 16 (0.24–4.01; 3.77)
Heterodera cyperi | n = 2: 0 (0)
Heterodera daverti | n = 5: 0–2; 1 (0–0.24; 0.11) | n = 3: 0–2; 2 (0–0.51; 0.51)
Heterodera dunensis | n = 3: 0 (0) | n = 3: 0 (0) | n = 3: 0 (0) | n = 8: 0–1; 0 (0–0.28; 0)
Heterodera elachista | n = 3: 3–4; 3 (0.25–0.36; 0.36) | n = 43: 0–48; 11 (0–4.52; 1.05) | n = 18: 0–4; 1 (0–0.54; 0.15) | n = 2: 0 (0)
Heterodera fici | n = 2: 2 (0.34) | n = 8: 0–3; 1 (0–0.31; 0.10) | n = 9: 0–2; 1 (0–0.47; 0.24)
Heterodera filipjevi | n = 3: 1–2; 1 (0.12–0.17; 0.17) | n = 256: 0–43; 2 (0–4.46; 0.21) | n = 10: 0–2; 1 (0–0.30; 0.13) | n = 75: 0–44; 10 (0–10.58; 2.42)
Heterodera glycines | n = 6: 0–1; 0 (0–0.16; 0) | n = 319: 0–112; 3 (0–11.89; 0.32) | n = 56: 0–43; 2 (0–5.79; 0.26) | n = 81: 0–9; 0 (0–1.04; 0)
Heterodera goettingiana | n = 4: 0–10; 5 (0–1.21; 0.59) | n = 15: 0–81; 37 (0–11.34; 5.08) | n = 2: 19 (2.92) | n = 3: 0–17; 17 (0–4.84; 4.84)
Heterodera goldeni | n = 5: 0–6; 2 (0–0.59; 0.23)
Heterodera guangdongensis | n = 3: 0–3; 3 (0–0.26; 0.26) | n = 7: 1–6; 4 (0.15–0.91; 0.58)
Heterodera hordecalis | n = 2: 5 (0.31) | n = 37: 0–68; 7 (0–8.89; 0.84) | n = 2: 9 (1.33) | n = 10: 0–47; 14 (0–11.4; 3.37)
Heterodera humuli | n = 13: 0–4; 1 (0–0.57; 0.17) | n = 19: 0–9; 2 (0–5.92; 0.59)
Heterodera koreana | n = 3: 1–22; 2 (0.17–1.27; 0.34) | n = 29: 0–28; 7 (0–3.08; 0.73) | n = 48: 0–21; 2 (0–2.88; 0.27) | n = 44: 0–18; 5 (0–4.57; 1.27)
Heterodera latipons | n = 154: 0–122; 13 (0–13.32; 1.46) | n = 2: 1 (0.15) | n = 16: 0–50; 27 (0–12.05; 6.52)
Heterodera litoralis | n = 4: 0 (0)
Heterodera mani | n = 6: 1–9; 3 (0.10–0.93; 0.31) | n = 6: 0–1; 1 (0–0.27; 0.24)
Heterodera medicaginis | n = 14: 0–9; 3 (0–0.95; 0.42) | n = 14: 0–2; 0 (0–0.49; 0)
Heterodera mediterranea | n = 2: 1 (0.17) | n = 6: 1–9; 7 (0.11–0.95; 0.74) | n = 2: 3 (0.38)
Heterodera mothi | n = 2: 3 (0.30) | n = 2: 2 (0.44)
Heterodera orientalis | n = 5: 6–18; 12 (0.66–1.94; 1.28)
Heterodera pratensis | n = 13: 0–10; 6 (0–1.08; 0.62) | n = 20: 0–8; 2 (0–1.93; 0.53)
Heterodera ripae | n = 5: 0–4; 2 (0–0.46; 0.22) | n = 19: 0–11; 1 (0–2.59; 0.24)
Heterodera sacchari | n = 3: 0–1; 1 (0–0.09; 0.09)
Heterodera salixophila | n = 5: 0–20; 15 (0–2.10; 1.58) | n = 2: 1 (0.15) | n = 16: 0–18; 2 (0–4.57; 0.51)
Heterodera schachtii | n = 9: 1–9; 4 (0.06–0.99; 0.31) | n = 64: 0–31; 12 (0–3.04; 1.29) | n = 20: 0–4; 1 (0–0.58; 0.13) | n = 38: 0–7; 1 (0–1.88; 0.12)
Heterodera sojae | n = 5: 0–12; 4.5 (0–1.36; 1.03) | n = 10: 0–8; 4 (0–1.06; 0.53) | n = 12: 0–6; 1 (0–1.59; 0.24)
Heterodera sturhani | n = 7: 0–3; 0 (0–0.72; 0)
Heterodera trifolii | n = 5: 0–8; 3 (0–0.47; 0.29) | n = 23: 0–14; 3 (0–1.48; 0.43) | n = 19: 0–1; 0 (0–0.15; 0) | n = 26: 0–5; 0 (0–1.24; 0)
Heterodera ustinovi | n = 5: 1–8; 3.5 (0.10–0.88; 0.36) | n = 5: 0–1; 1 (0–0.24; 0.24)
Heterodera zeae | n = 16: 0–39; 14 (0–5.65; 2.19) | n = 4: 0–2; 1 (0–0.31; 0.15)

Intraspecific differences are presented in the format “base pair difference range; median of base pair difference range (percent difference range; median of percent difference range).” Medians indicated only in cases of three or more comparisons; n = the number of sequences compared.
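The summary format described in this note (range and median of base pair differences, then range and median of percent differences) can be computed as in the following minimal Python sketch. The function name and input layout are hypothetical, and the reading of "three or more comparisons" as three or more pairwise comparisons is an assumption; it is not the authors' script.

```python
from statistics import median


def summarize_differences(comparisons):
    """Summarize pairwise comparisons given as (bp_difference, compared_sites).

    Medians are reported only when there are three or more comparisons;
    otherwise they are returned as None.
    """
    if not comparisons:
        raise ValueError("need at least one pairwise comparison")
    bp = [d for d, _ in comparisons]
    pct = [round(100 * d / sites, 2) for d, sites in comparisons]
    show_median = len(comparisons) >= 3
    return {
        "bp_range": (min(bp), max(bp)),
        "bp_median": median(bp) if show_median else None,
        "pct_range": (min(pct), max(pct)),
        "pct_median": median(pct) if show_median else None,
    }


# Toy comparisons (hypothetical values, not taken from Table 3).
print(summarize_differences([(0, 1000), (3, 1000), (9, 1000)]))
print(summarize_differences([(5, 1600)]))
```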

Fifteen species pairs could not be reliably delineated in each of the ITS and 28S gene regions, followed by 12 species pairs in the 18S gene region and just two in the coxI region (Table 2). Notably, several species pairs could not be distinguished across multiple markers. For example, H. avenae shares some identical ITS and 28S gene sequences with H. pratensis Gäbler, Sturhan, Subbotin & Rumpenhorst, 2000. Heterodera schachtii A. Schmidt, 1871 shares some identical sequences with H. betae Wouts, Rumpenhorst & Sturhan, 2001, and H. trifolii Goffart, 1932 in the 18S and ITS gene regions and differs from these species by only one and two bp in the 28S gene region, respectively. Sequences of coxI had the most utility for distinguishing between species, with just one case of a weak, and one case of an inadequate species pair delineation; these notable cases are discussed further below.

Discussion

Our analyses indicate that several of the most commonly employed molecular markers for the characterization of species of Heterodera lack the necessary resolution for distinguishing between some species, including a number of plant pests of significant global biosecurity concern. Among those for which molecular identification issues were detected, 13 species (H. avenae, H. carotae Jones, 1950a, H. ciceri Vovlas, Greco & Di Vito, 1985, H. cruciferae Franklin, 1945, H. daverti Wouts & Sturhan, 1978, H. elachista Ohshima, 1974, H. filipjevi (Madzhidov, 1981) Stelter, 1984, H. glycines Ichinohe, 1952, H. goettingiana Liebscher, 1892, H. hordecalis Andersson, 1975, H. oryzae Luc & Brizuela, 1961, H. schachtii and H. trifolii) are listed as regulated pests in at least one country (Singh et al., 2013). This has numerous implications for routine molecular diagnoses of species of Heterodera, such as the potential for confusing a major pest with a minor or non-pest species. For example, H. avenae is a major pest of cereals in temperate regions and causes significant annual yield losses throughout its range (e.g., Meagher, 1982; Smiley et al., 2005; Nicol and Rivoal, 2008), but our analyses showed that it could not be reliably distinguished from H. arenaria Cooper, 1955 or H. pratensis using the most commonly employed molecular marker, ITS. Both latter species parasitize non-crop grasses (Subbotin et al., 2018); thus, there is potential for confusing H. avenae with one of these non-pest species. Such cases of mistaken identity could result in misdiagnosed infestations and novel incursions. In addition, mixed infestations of some closely related species which parasitize similar crops, such as members of the Schachtii group, may go undetected. 
Although a combination of morphological data and sequences from multiple gene regions is best for identification of species of Heterodera, our results suggest that where resources or expertise are limited, the coxI region should be favored over 18S, ITS, and 28S for basic diagnostic purposes. It is significant that a number of economically important species of Heterodera cannot be delineated using ITS, as to date this marker has been used more extensively than any other for both identification and phylogenetic purposes (e.g., Subbotin et al., 2001, 2017; Tanha Maafi et al., 2003). Furthermore, more species of Heterodera have been characterized with ITS than any other marker, and for many species, this is the only gene region for which sequences are available. Thus, when developing molecular diagnostic tests for plant-parasitic nematodes like Heterodera spp., the trade-off between delineation power and species coverage needs careful consideration. There is currently a great amount of interest in employing high-throughput sequencing methods, such as metabarcoding and eDNA, for detection and monitoring of pest species (e.g., Abdelfattah et al., 2018; Valentin et al., 2018; Ruppert et al., 2019; Hardulak et al., 2020; Young et al., 2021). Many metabarcoding studies have utilized short fragments of nuclear genes such as 18S for species detection and identifications (e.g., Macheriotou et al., 2019; Ruppert et al., 2019; Giebner et al., 2020), but for Heterodera, fragments of the nuclear genes tested here lack the power to distinguish between all species. Those using such tools will need to be aware of the characteristics and shortcomings of the marker(s) employed and may need to follow up species detections with other methods to confirm identifications. The coxI gene seems robust for species delineation in the genus Heterodera; however, we did observe two instances in which this marker showed potential shortcomings. 
The first issue was detected in our coxI phylogenetic analyses of the Avenae group, in which sequences of H. arenaria formed a clade within the larger H. avenae species cluster. The sequences of H. arenaria differ by just two bp from those of H. avenae, well below the intraspecific variation observed for the latter species. However, these two bp appear to be unique changes, suggesting that H. arenaria can at least be distinguished from H. avenae from a barcoding, if not a phylogenetic, perspective. This is consistent with the findings of Subbotin et al. (2018), in which H. arenaria and H. avenae were shown to be distinct in a haplotype network but formed a polyphyly in a tree derived from Bayesian analysis. Subbotin et al. (2018) speculated that H. arenaria represents a species recently diverged from a European population of H. avenae and remarked that, from a phytosanitary perspective, it is best to retain the species status of H. arenaria because it parasitizes coastal grasses of no economic importance, rather than cereals as in H. avenae.
The second instance in which coxI showed a possible shortcoming is the case of distinguishing H. carotae from H. cruciferae. These two species have previously been reported as indistinguishable using PCR-ITS-RFLP profiles and ITS gene sequences (Subbotin et al., 2000, 2001). In a study of H. carotae in Mexico, Escobar-Avila et al. (2018) produced novel coxI sequences of that species and also several coxI sequences of H. cruciferae, which were the only coxI sequences available for the latter species at the time of writing. Escobar-Avila et al. (2018) found that H. carotae and H. cruciferae could not be distinguished using coxI sequences and concluded that differentiating these species requires an integrated approach including morphology and a test of host range. Although it is entirely possible that H. carotae and H. cruciferae are indistinguishable using coxI, considering that the former species appears restricted to carrots as hosts (Jones, 1950; Winslow, 1954; Mugniery and Bossis, 1988) and the latter to brassicas and a few species of the Lamiaceae (Winslow, 1954; Baldwin and Mundo-Ocampo, 1991), it is somewhat surprising that these species would exhibit so little divergence in such a rapidly evolving gene. Escobar-Avila et al. (2018) provided three sequences of H. cruciferae, two from specimens from the USA and one from Russia, but did not provide a morphological account of these specimens, presumably because they were all consumed in molecular analyses. The Russian sequence of H. cruciferae differs from those from the USA by 15 bp, whereas the USA sequences of H. cruciferae differ from sequences of H. carotae from multiple countries by as little as 1 bp. Thus, the current intraspecific variation for H. cruciferae is greater than the difference between some sequences of H. carotae and H. cruciferae. Because these two species are fairly similar morphologically and could easily co-occur in mixed or rotated vegetable fields, it is possible that some of the putative specimens of H. cruciferae used by Escobar-Avila et al. (2018) were misidentified. At present, there are simply too few coxI sequences of H. cruciferae to be certain of the utility of this marker, or lack thereof, for the H. carotae–H. cruciferae species pair. Heterodera carotae is an important pest of carrots throughout its range (Greco et al., 1994; Esquibet et al., 2020), and although H. cruciferae is largely considered a minor pest, there have been a few reports of significant damage to crops infested by this species (Lear et al., 1965; Sykes and Winfield, 1966). It is therefore critical that a suitable molecular marker be identified that can reliably distinguish between these two species. Using a collection of microsatellite markers, Esquibet et al. (2020) demonstrated a high level of genetic divergence between H. carotae and H. cruciferae and remarked that microsatellites could be used to develop a diagnostic test for these species. If further study confirms that the coxI gene cannot reliably distinguish between H. carotae and H. cruciferae, other molecular methods such as microsatellites (e.g., Gautier et al., 2019; Esquibet et al., 2020) can fill this diagnostic gap.
Although the number of obviously erroneous sequences detected was low, for several species of Heterodera we observed high levels of intraspecific variation within one or more of the molecular markers evaluated, primarily ITS. Unsurprisingly, this was most apparent in those species/marker combinations with a large pool of sequences, such as H. avenae, H. glycines, and H. latipons Franklin, 1969, each of which had over 100 ITS sequences available. In these species, we observed intraspecific maxima of 80–122 bp (12%–13%) in the ITS gene. Additionally, some species had large intraspecific variation despite lower sample size, such as H. goettingiana with 15 ITS sequences and an intraspecific maximum of 81 bp (11%) and H. hordecalis with 37 ITS sequences and an intraspecific maximum of 68 bp (9%). None of these intraspecific maxima is implausible when considering that these datasets include sequences of individuals sourced from many geographically disparate populations (e.g., Subbotin et al., 2003, 2018). However, these large levels of intraspecific variation do clash with the patterns observed for most other members of the genus, notably several other species with medium to large sequence pools, such as H. schachtii and H. filipjevi, both of which exhibit intraspecific maxima of less than 5% in ITS. We think it is likely that for several species the overall level of intraspecific variation observed for some molecular markers is inflated.
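The comparison of intraspecific maxima with interspecific minima discussed above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the function names and the toy sequences are hypothetical, and it assumes sequences taken from a single alignment, with gaps and ambiguous bases excluded from the count.

```python
from itertools import combinations, product


def bp_difference(a: str, b: str) -> tuple[int, int]:
    """Return (mismatches, compared_sites) for two aligned sequences.

    Columns holding a gap or an ambiguous base in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    valid = set("ACGT")
    compared = 0
    mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            if x != y:
                mismatches += 1
    return mismatches, compared


def intraspecific_max(seqs: list[str]) -> int:
    """Largest pairwise bp difference among sequences of one species."""
    return max((bp_difference(a, b)[0] for a, b in combinations(seqs, 2)), default=0)


def interspecific_min(seqs_a: list[str], seqs_b: list[str]) -> int:
    """Smallest pairwise bp difference between sequences of two species."""
    return min(bp_difference(a, b)[0] for a, b in product(seqs_a, seqs_b))


# Hypothetical toy alignment, not real Heterodera data.
species_a = ["ACGTACGTAC", "ACGTACGTAT"]
species_b = ["ACGTACGTAA", "TCGAACCTAA"]

gap_exists = interspecific_min(species_a, species_b) > max(
    intraspecific_max(species_a), intraspecific_max(species_b)
)
print(gap_exists)
```

When the largest intraspecific difference exceeds the smallest interspecific difference, as in the toy data, sequence matching alone cannot separate the two labels, and the sequences involved merit a check for misidentification or editing errors.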
In our assessment of sequence reliability, in many instances, we could not be certain whether the intraspecific variation observed was due to real population-level genetic variability or to artifacts of poor-quality sequencing results and/or sequence editing errors. We flagged sequences that diverged greatly from their congeners as problematic; however, in larger sequence pools, relatively small levels of variation between individual sequences become amplified, resulting in the very large levels of overall intraspecific variation detected in pairwise comparisons. This is an important limitation of our analyses and of the overall pool of publicly available sequences.
A related issue is species for which only a few sequences are available, but one or more of those sequences diverge significantly from the others. Again, it can be difficult to determine whether such divergence represents natural variation or an artifact such as misidentification or sequence editing errors. Furthermore, the large number of sequences submitted to GenBank that are not associated with a published manuscript leaves many issues related to sequence identity ambiguous, as there is no account of the morphology of the specimens utilized. A good example of both of the above relates to the 28S sequences available for H. cruciferae. Sasanelli et al. (2013) performed a thorough study of H. cruciferae in Italian cabbage, identified specimens using morphology, and characterized them with ITS and 28S rDNA gene sequences. However, BLAST results for the 28S sequence of H. cruciferae from that study suggest that it is probably representative of H. schachtii, another cyst nematode widespread on brassicas (Subbotin et al., 2010b). There is one other 28S sequence of H. cruciferae available, but it is not associated with a published manuscript (KP114546; Jabbari et al., unpublished); this sequence is very close to H. carotae but not identical. This suggests that this “unpublished” sequence is accurate, but without a morphological account or other sequences with which to compare, it is still uncertain. This results in a situation where, although two 28S sequences are available for H. cruciferae, we still cannot be confident that either is truly representative of that species.
Thirty species of Heterodera lack molecular data and thus cannot be diagnosed using barcoding methodologies. Of the 57 species of Heterodera for which molecular data are available, eight are represented by just a single sequence and over half of the remainder have just one sequence for one or more of the markers evaluated here. Where only a single sequence of a particular marker is available, it is best treated with caution when used for diagnostic purposes, as additional sequences, ideally from independent studies, are needed for confidence that such sequences are truly representative of the species they purportedly represent.
There is little doubt that the erroneous sequences we observed on GenBank were uploaded in good faith. However, to avoid errors, it is important that authors carefully compare their novel sequences with those available in public databases via BLAST or other phylogenetic methods before upload. Original sequencing results should be scrutinized for base-call quality, and poor-quality sequences should be discarded. Where possible, multiple sequence replicates of each gene should be produced and compared prior to upload to ensure base calls are consistent. It is also advisable to sequence multiple gene regions from individual isolates to ensure that molecular-based identification is consistent across markers. We also encourage authors who have uploaded sequences that later prove to be mislabeled or problematic to correct them.
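The recommendation to compare sequence replicates before upload can be sketched as a simple column-wise check. This is an illustrative example only: `discordant_sites` and the reads shown are hypothetical, and the sketch assumes that the replicate reads have already been trimmed and aligned to the same length.

```python
def discordant_sites(replicates: list[str]) -> list[int]:
    """Return 0-based alignment positions at which replicate reads disagree.

    Gaps and ambiguous calls (e.g. N) are ignored, so a position is reported
    only when at least two different unambiguous bases were called.
    """
    if len({len(r) for r in replicates}) != 1:
        raise ValueError("replicates must be aligned to the same length")
    sites = []
    for i, column in enumerate(zip(*(r.upper() for r in replicates))):
        called = {base for base in column if base in "ACGT"}
        if len(called) > 1:
            sites.append(i)
    return sites


# Hypothetical replicate reads of one amplicon from a single isolate.
reads = ["ACGTTGCA", "ACGTTGCA", "ACGATGCA"]
print(discordant_sites(reads))
```

Any position reported by such a check would be a candidate for re-inspection of the original chromatograms rather than a basis for automatic correction.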
High-throughput sequencing techniques utilizing new and more accurate markers are already superseding other molecular-based diagnostic methods for cyst nematodes in many laboratories (e.g., Gautier et al., 2019; Esquibet et al., 2020). Despite this, we foresee the four markers evaluated here remaining in use for identification of plant parasitic nematodes for many years to come. With that in mind, of the four markers evaluated, coxI shows the greatest utility for identification and delineation of species of Heterodera. However, the coxI gene might not be best for phylogenetic interpretation (Subbotin et al., 2015, 2018); so, going forward, a combination of coxI and ITS, plus 28S where possible, seems ideal, especially as coxI data are presently available for only around a third of species of Heterodera. There is a real possibility of hybridization between species of Heterodera (Potter and Fox, 1965); thus, a combination of a mitochondrial and a nuclear marker is recommended for areas where the ranges of multiple species overlap. Molecular diagnoses should be based not only on multiple molecular markers used in concert but also on a combination of sequence matching and tree-based phylogenetic methods.
Lastly, it is important to keep in mind that no one technique is likely to be a panacea for the identification of species of Heterodera. New species are being described at a fairly steady rate (e.g., Li et al., 2020; Phougeishangbam et al., 2020; Jiang et al., 2022), and sequence data continue to accumulate and provide further insight into the population genetics, host associations, and phylogeography of Heterodera (Subbotin et al., 2018; Esquibet et al., 2020; Oro and Tabakovic, 2020). It is telling that so few sequences have been generated from countries in South America and Sub-Saharan Africa, as this seems an obvious result of lack of resources rather than lack of Heterodera species richness. Thus, there is still a great need for traditionally trained taxonomists and diagnosticians who can employ a range of techniques to identify these problematic nematodes.
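The recommendation above to use mitochondrial and nuclear markers in concert can be illustrated with a minimal concordance check. The function name, the input format, and the example matches are hypothetical; the sketch assumes that the closest-matching species for each marker has already been determined by sequence matching or tree placement.

```python
def marker_concordance(best_hits: dict[str, str]) -> tuple[str, list[str]]:
    """Summarize whether all markers point to the same species.

    best_hits maps a marker name (e.g. "coxI") to the species name of its
    closest match. Returns ("concordant", [species]) or ("conflict", [species, ...]).
    """
    if not best_hits:
        raise ValueError("at least one marker result is required")
    species = sorted(set(best_hits.values()))
    status = "concordant" if len(species) == 1 else "conflict"
    return status, species


# Hypothetical results for a single isolate.
isolate = {"coxI": "Heterodera avenae", "ITS": "Heterodera avenae", "28S": "Heterodera arenaria"}
print(marker_concordance(isolate))
```

A conflict between markers does not identify its own cause; it only signals that misidentification, a problematic reference sequence, or hybridization should be considered and that morphology or further markers are needed.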
Table S1

Total number of sequences of species of Heterodera per country available on the NCBI database GenBank, up to 10 August 2021.

Number of sequences recorded
Country ITS 18S 28S coxI Total
Afghanistan11
Algeria627877
Australia111618
Azerbaijan36137
Belgium21121438
Canada1917532
Chile11
China7611613121937
Costa Rica1113
Cyprus2121
Czech Republic77721
Egypt7310
Estonia22
France11920
Georgia112
Germany3512864
Ghana22
Greece611715
Guatemala22
India2615142
Iran411651108
Iraq3811
Ireland123
Israel77
Italy19681245
Ivory Coast11
Japan476450161
Jordan235
Kazakhstan5353
Kyrgyzstan32133
Madagascar22
Mexico336
Morocco34539
Myanmar22
Netherlands1623728
New Zealand22711
Norway44
Pakistan213
Poland123
Portugal112
Qatar11
Russia122335
Saudi Arabia231125
Serbia7310
Slovak Republic347
South Africa3115
South Korea303836104
Spain10331329
Sweden4228
Switzerland33
Syria1512137
Tajikistan134
Tunisia412613
Turkey21953272
Ukraine31711
United Kingdom111921
USA71718158254
Vietnam213
Unknown2211125
Total1724773465922739
Table S2

Species of Heterodera, and associated number of sequences, recorded from 59 countries based on data from the NCBI database GenBank, up to 10 August 2021.

Total number of sequences recorded
Species Country 18S ITS 28S coxI
Heterodera arenaria Italy13
Netherlands2
United Kingdom31
Heterodera aucklandica Belgium1
New Zealand113
United Kingdom12
Heterodera australis Australia75
China11
Heterodera avenae Algeria46
Australia1
Azerbaijan9
China4359574
Czech Republic777
Egypt53
France97
Germany34
India41
Iran214
Iraq3
Israel34
Morocco275
Norway4
Pakistan1
Saudi Arabia2311
Serbia11
South Korea121
Spain33
Switzerland1
Syria36
Tunisia16
Turkey3926
United Kingdom1
USA110314
Unknown4
Heterodera betae Belgium1
Germany3
Netherlands322
Unknown2
Heterodera bifenestra Belgium2
Sweden11
Heterodera cajani India1213
Heterodera cardiolata Pakistan11
United Kingdom1
Heterodera carotae Belgium1
Algeria2
Canada565
France12
Italy315
Mexico21
South Africa111
Switzerland1
Heterodera ciceri Syria31
Heterodera circeae Germany1
Heterodera cruciferae Algeria5
Iran11
Italy11
Netherlands1
Russia11
Serbia1
USA12
Heterodera cyperi Spain1
USA11
Heterodera daverti Germany32
Italy111
Netherlands11
Heterodera dunensis Spain3338
Heterodera elachista China12710
Iran211
Italy2442
Japan113
Heterodera fici Canada2
Georgia11
Greece4
Iran13
Italy21
Qatar1
USA211
Heterodera filipjevi Algeria2
Azerbaijan20
China5872
Germany21
Greece1
India1
Iran519
Italy1
Kazakhstan25
Kyrgyzstan31
Russia37
Serbia12
Slovakia1
Spain12
Sweden11
Syria6
Tajikistan13
Turkey8020
Ukraine3
United Kingdom13
USA315
Unknown2
Heterodera glycines Canada1121
Chile1
China527529
Iran21
Japan610
South Korea USA14 217 781
Unknown11
Heterodera goettingiana China11013
Germany21
Iran1
Ireland1
Italy2
Netherlands4
USA1
Heterodera goldeni Egypt2
Iran2
Israel1
Heterodera graminis India1
Heterodera graminophila USA1
Heterodera guangdongensis China33
Myanmar2
Vietnam21
Heterodera hainanensis China11
Heterodera hordecalis Algeria2212
Estonia1
Germany12
Iran43
Israel32
Italy2
Netherlands12
Slovakia12
Sweden11
Tunisia1
United Kingdom1
Heterodera humuli Belgium1
Germany12
Iran318
Kyrgyzstan11
Portugal1
Russia12
USA65
Heterodera koreana China122
Iran11
Japan204141
South Korea333
USA231
Heterodera latipons Cyprus21
Iran33
Israel2
Jordan23
Kazakhstan28
Morocco3
Russia11
Syria917
Turkey873
Heterodera litoralis New Zealand114
Heterodera mani Germany41
United Kingdom1
USA14
Unknown1
Heterodera medicaginis Russia1
USA13114
Heterodera mediterranea Italy22
Spain2
Tunisia122
Heterodera microulae China111
Heterodera mothi Azerbaijan11
Iran11
Heterodera orientalis Guatemala2
Russia1
USA21
Heterodera oryzae South Korea111
Heterodera oryzicola India11
Heterodera persica Iran1
Heterodera pratensis Belgium1
Germany88
Iran5
Netherlands11
Russia12
South Korea212
USA11
Heterodera ripae Belgium Germany15 4
Greece12
Russia Serbia1 14
Slovakia2
Sweden1
United Kingdom2
Heterodera sacchari Ghana2
Ivory Coast1
Heterodera salixophila Belgium113
Germany4
Iran11
Russia5
Ukraine34
Heterodera schachtii Algeria1
Australia4
Belgium115
China1183
France13
Germany4
Iran21
Ireland2
Japan111
Mexico12
Morocco3
Netherlands811
Poland11
Serbia3
South Africa1
South Korea347
Turkey12
USA2322
Unknown61
Heterodera scutellariae Germany1
Heterodera sinensis China1
Heterodera skohensis India1
Heterodera sojae China11
Japan11
South Korea3812
Heterodera sorghi India1
Heterodera sturhani China7
Heterodera trifolii Costa Rica111
Estonia1
Iran112
Italy1
Japan888
South Korea889
United Kingdom121
USA25
Unknown4
Heterodera turcomanica Iran11
Heterodera urticae Belgium113
Heterodera ustinovi Belgium1
Germany1
Slovakia1
United Kingdom1
USA12
Heterodera vallicola Russia11
Heterodera zeae Afghanistan1
China5
Greece151
India4
Portugal1
USA21
  26 in total

1.  Phylogenetic relationships within the cyst-forming nematodes (Nematoda, Heteroderidae) based on analysis of sequences from the ITS regions of ribosomal DNA.

Authors:  S A Subbotin; A Vierstraete; P De Ley; J Rowe; L Waeyenberge; M Moens; J R Vanfleteren
Journal:  Mol Phylogenet Evol       Date:  2001-10       Impact factor: 4.286

2.  Phylogenetic Relationships Among Selected Heteroderoidea Based on 18S and ITS Ribosomal DNA.

Authors:  V R Ferris; A Sabo; J G Baldwin; M Mundo-Ocampo; R N Inserra; S Sharma
Journal:  J Nematol       Date:  2004-09       Impact factor: 1.402

3.  Molecular phylogeny of the Tylenchina and evolution of the female gonoduct (Nematoda: Rhabditida).

Authors:  Wim Bert; Frederik Leliaert; Andy R Vierstraete; Jacques R Vanfleteren; Gaetan Borgonie
Journal:  Mol Phylogenet Evol       Date:  2008-04-14       Impact factor: 4.286

4.  Cereal Cyst Nematodes: A Complex and Destructive Group of Heterodera Species.

Authors:  Richard W Smiley; Abdelfattah A Dababat; Sadia Iqbal; Michael G K Jones; Zahra Tanha Maafi; Deliang Peng; Sergei A Subbotin; Lieven Waeyenberge
Journal:  Plant Dis       Date:  2017-08-15       Impact factor: 4.438

5.  MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms.

Authors:  Sudhir Kumar; Glen Stecher; Michael Li; Christina Knyaz; Koichiro Tamura
Journal:  Mol Biol Evol       Date:  2018-06-01       Impact factor: 16.240

6.  DNA metabarcoding for biodiversity monitoring in a national park: Screening for invasive and pest species.

Authors:  Laura A Hardulak; Jérôme Morinière; Axel Hausmann; Lars Hendrich; Stefan Schmidt; Dieter Doczkal; Jörg Müller; Paul D N Hebert; Gerhard Haszprunar
Journal:  Mol Ecol Resour       Date:  2020-07-01       Impact factor: 7.090

7.  Comparing diversity levels in environmental samples: DNA sequence capture and metabarcoding approaches using 18S and COI genes.

Authors:  Hendrik Giebner; Kathrin Langen; Sarah J Bourlat; Sandra Kukowka; Christoph Mayer; Jonas J Astrin; Bernhard Misof; Vera G Fonseca
Journal:  Mol Ecol Resour       Date:  2020-06-24       Impact factor: 7.090

Review 8.  The global importance of the cereal cyst nematode (Heterodera spp.) on wheat and international approaches to its control.

Authors:  J M Nicol; I H Elekçioğlu; N Bolat; R Rivoal
Journal:  Commun Agric Appl Biol Sci       Date:  2007

9.  Metabarcoding free-living marine nematodes using curated 18S and CO1 reference sequence databases for species-level taxonomic assignments.

Authors:  Lara Macheriotou; Katja Guilini; Tania Nara Bezerra; Bjorn Tytgat; Dinh Tu Nguyen; Thi Xuan Phuong Nguyen; Febe Noppe; Maickel Armenteros; Fehmi Boufahja; Annelien Rigaux; Ann Vanreusel; Sofie Derycke
Journal:  Ecol Evol       Date:  2019-01-22       Impact factor: 2.912

10.  Morphological and molecular characterization of Heterodera dunensis n. sp. (Nematoda: Heteroderidae) from Gran Canaria, Canary Islands.

Authors:  Phougeishangbam Rolish Singh; Gerrit Karssen; Marjolein Couvreur; Wim Bert
Journal:  J Nematol       Date:  2020-11-24       Impact factor: 1.402
