Guru V Radhakrishnan1, Nicola M Cook1, Vanessa Bueno-Sancho1, Clare M Lewis1, Antoine Persoons1, Abel Debebe Mitiku2, Matthew Heaton1, Phoebe E Davey1, Bekele Abeyo3, Yoseph Alemayehu3, Ayele Badebo3, Marla Barnett4, Ruth Bryant5, Jeron Chatelain4, Xianming Chen6, Suomeng Dong7, Tina Henriksson8, Sarah Holdgate9, Annemarie F Justesen10, Jay Kalous4, Zhensheng Kang11, Szymon Laczny12, Jean-Paul Legoff13, Driecus Lesch14, Tracy Richards4, Harpinder S Randhawa15, Tine Thach10, Meinan Wang6, Mogens S Hovmøller10, David P Hodson3, Diane G O Saunders16.
Abstract
BACKGROUND: Effective disease management depends on timely and accurate diagnosis to guide control measures. The capacity to distinguish between individuals in a pathogen population with specific properties such as fungicide resistance, toxin production and virulence profiles is often essential to inform disease management approaches. The genomics revolution has led to technologies that can rapidly produce high-resolution genotypic information to define individual variants of a pathogen species. However, their application to complex fungal pathogens has remained limited due to the frequent inability to culture these pathogens in the absence of their host and their large genome sizes.
Keywords: Disease diagnostics; Genomics; Nanopore sequencing; Pathogen surveillance; Point of care; Wheat rust
Year: 2019 PMID: 31405370 PMCID: PMC6691556 DOI: 10.1186/s12915-019-0684-y
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
Fig. 1 The global Pst population is highly diverse and largely consists of geographically isolated groups of distinct homogenous individuals. a The global Pst population analysed herein consisted of 14 distinct groups of individuals. Phylogenetic analysis was performed on a total of 280 transcriptomic and 21 genomic datasets from Pst isolates spanning 24 countries, using a maximum-likelihood model and 100 bootstraps. Scale indicates the mean number of nucleotide substitutions per site. Bootstrap values are provided in Additional file 3. b Multivariate discriminant analysis of principal components (DAPC) could further define subdivisions within the global Pst population. A list of 135,139 biallelic synonymous single nucleotide polymorphisms (SNPs) was used for DAPC analysis. Assessment of the Bayesian Information Criterion (BIC) supported initial division of the Pst isolates into five genetically related groups (left; C1–5). Due to the high level of diversity among the global Pst population, this initial analysis could not resolve Pst isolates with lower levels of within-group variation. Therefore, a second DAPC analysis was carried out on each of the five initial population groups (right). Bar charts represent DAPC analysis, with each bar representing estimated membership fractions for each individual. Roman numerals represent the successive K values for each DAPC analysis. Numbers in circles are reflective of those assigned to distinct groups in the phylogenetic analysis
Fig. 2 The sequences of 242 highly polymorphic Pst genes are sufficient to reconstruct the topology of the global phylogeny generated from full transcriptome and genome sequencing. a Ordered distribution of average SNP content per gene across the 301 Pst global isolates. To determine the minimum number of gene sequences required to accurately reconstruct the global phylogeny, the 1690 genes identified as polymorphic (SNPs/kb ≥ 0.001) between Pst isolates were ordered according to number of polymorphic sites across the 301 global Pst isolates. b The 242 polymorphic genes selected were not biased in their selection by a high degree of divergence from the reference race PST-130 for any particular group of individuals. Box plots represent the total number of SNPs across these 242 genes for Pst isolates belonging to each of the five major genetic groups identified through DAPC analysis. Bar represents median value, box signifies the upper (Q3) and lower (Q1) quartiles, data falling outside the Q1–Q3 range are plotted as outliers. c The 242 genes selected could be used successfully to reconstruct the global phylogeny and assign Pst isolates to the 14 previously defined groups (numbers in circles). Phylogenetic analysis was performed using sequence data for the 242 genes from the 301 global Pst isolates using a maximum-likelihood model and 100 bootstraps. Bootstrap values are provided in Additional file 7
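The filtering and ordering step described for Fig. 2a can be illustrated with a minimal sketch. It assumes a hypothetical mapping from gene name to (number of polymorphic sites, gene length in bp). The function names, the toy data and the use of a simple top-N cutoff are illustrative assumptions, not the authors' actual selection procedure; only the SNPs/kb ≥ 0.001 threshold and the ordering by number of polymorphic sites come from the caption.

```python
from typing import Dict, List, Tuple


def snps_per_kb(snp_count: int, gene_length_bp: int) -> float:
    """SNP density of one gene, expressed as SNPs per kilobase."""
    return snp_count / (gene_length_bp / 1000.0)


def rank_polymorphic_genes(
    genes: Dict[str, Tuple[int, int]],
    min_snps_per_kb: float = 0.001,  # threshold quoted in the caption
    top_n: int = 242,                # assumed cutoff, for illustration only
) -> List[str]:
    """Keep genes at or above the density threshold, then order them by
    number of polymorphic sites (descending) and return the first top_n."""
    polymorphic = {
        name: (snps, length)
        for name, (snps, length) in genes.items()
        if length > 0 and snps_per_kb(snps, length) >= min_snps_per_kb
    }
    # Ties are broken alphabetically so the ordering is deterministic.
    ranked = sorted(polymorphic, key=lambda name: (-polymorphic[name][0], name))
    return ranked[:top_n]


# Hypothetical toy input: gene -> (polymorphic sites, length in bp).
example = {
    "geneA": (12, 1500),
    "geneB": (0, 2000),   # no SNPs, so it fails the density threshold
    "geneC": (30, 900),
    "geneD": (5, 5000),
}
selected = rank_polymorphic_genes(example, top_n=2)
print(selected)
```

With real data, the input would be built from per-gene SNP calls across all isolates; how the cutoff of 242 genes was chosen is not stated in this caption.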
Fig. 3 The 242 Pst genes selected are evenly distributed across the Pst genome and a large proportion encode proteins with enzymatic functions. a For 241 of the 242 genes, near-identical (> 94% pairwise identity) hits were identified in the more contiguous Pst-104 genome and 60% were located on scaffolds that contained only one of the 241 genes. Bar chart illustrates the number of genes identified on the given numbers of scaffolds. b Functional annotation of the 242 Pst genes selected for MinION sequencing revealed that they largely encode proteins with enzymatic functions. Bar charts illustrate GO term analysis, with gene functions associated with ‘Biological process’, ‘Metabolic function’ and ‘Cellular component’ highlighted
Fig. 4 A minimum of 20x depth of coverage on the MinION sequencer is sufficient to generate comparable gene sequence data to the Illumina MiSeq platform. a At 20x coverage on the MinION sequencer, comparisons with data generated on the Illumina MiSeq platform showed 98.74% sequence identity. b No notable selective bias occurred during library preparation and sequencing of individual genes using either the MiSeq or MinION platforms. Box plots show the percentage coverage for each of the 242 Pst genes sequenced for the four Pst isolates tested on the MinION and MiSeq platforms. c The number of SNPs per gene detected in each of the four MinION datasets was comparable to that from the MiSeq platform. Heatmaps represent the number of SNPs identified per gene (y-axis) for the four Pst isolates sequenced on the MinION and MiSeq platforms. Full details regarding the number of SNPs identified per gene are provided in Additional file 1: Table S9. In box plots a and b, bars represent median value, boxes signify the upper (Q3) and lower (Q1) quartiles, data falling outside the Q1–Q3 range are plotted as outliers
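As a rough illustration of the kind of comparison summarised in Fig. 4a, percentage sequence identity between two aligned consensus sequences could be computed as sketched below. This assumes the two sequences are already aligned to equal length and that gap ('-') and ambiguous ('N') positions are skipped. Both the function and these conventions are assumptions for illustration, not the authors' published method.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching bases between two pre-aligned sequences.

    Positions where either sequence has a gap ('-') or an ambiguous
    base ('N') are excluded from the comparison (an assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical ten-base example with a single mismatch at the last position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
print(identity)
```

In practice such a comparison would be run per gene on the consensus sequences from the two platforms, and the per-gene values summarised across the 242 genes.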
Fig. 5 Gene sequencing on the MinION platform can be used to accurately genotype Pst isolates and define specific race groups. All Ethiopian Pst isolates collected from 2016 onwards cluster in a single monophyletic group (orange diamonds). The 13 representatives of previously defined race groups (numbered squares) tended to cluster in the phylogeny with Pst isolates of a similar genetic background. Phylogenetic analysis was carried out using a maximum-likelihood model and 100 bootstraps. Scale indicates the mean number of nucleotide substitutions per site. Bootstrap values are provided in Additional file 10
Fig. 6 Illustration of the MARPLE pipeline. A simplistic Mobile And Real-time PLant disEase (MARPLE) diagnostics pipeline was developed so that the 242 polymorphic Pst genes could be amplified and sequenced on the MinION platform for population genetic analysis in situ. This pipeline consists of three stages (DNA preparation, Sequencing and Data analysis) and can be executed independently of stable electricity or internet connectivity in less than 2 days from sample collection to completion of the phylogenetic analysis