Literature DB >> 33108372

Proteomic fingerprinting of Neotropical hard tick species (Acari: Ixodidae) using a self-curated mass spectra reference library.

Rolando A Gittens1,2, Alejandro Almanza1, Kelly L Bennett1,3, Luis C Mejía1,3, Javier E Sanchez-Galan1,4, Fernando Merchan5, Jonathan Kern5,6, Matthew J Miller7,8, Helen J Esser9, Robert Hwang10, May Dong10, Luis F De León1,11, Eric Álvarez12, Jose R Loaiza1,3,12.   

Abstract

Matrix-assisted laser desorption/ionization (MALDI) time-of-flight mass spectrometry is an analytical method that detects macromolecules that can be used for proteomic fingerprinting and taxonomic identification in arthropods. The conventional MALDI approach uses fresh laboratory-reared arthropod specimens to build a reference mass spectra library with high-quality standards required to achieve reliable identification. However, this may not be possible to accomplish in some arthropod groups that are difficult to rear under laboratory conditions, or for which only alcohol preserved samples are available. Here, we generated MALDI mass spectra of highly abundant proteins from the legs of 18 Neotropical species of adult field-collected hard ticks, several of which had not been analyzed by mass spectrometry before. We then used their mass spectra as fingerprints to identify each tick species by applying machine learning and pattern recognition algorithms that combined unsupervised and supervised clustering approaches. Both Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) classification algorithms were able to identify spectra from different tick species, with LDA achieving the best performance when applied to field-collected specimens that did have an existing entry in a reference library of arthropod protein spectra. These findings contribute to the growing literature that ascertains mass spectrometry as a rapid and effective method to complement other well-established techniques for taxonomic identification of disease vectors, which is the first step to predict and manage arthropod-borne pathogens.

Entities:  

Mesh:

Substances:

Year:  2020        PMID: 33108372      PMCID: PMC7647123          DOI: 10.1371/journal.pntd.0008849

Source DB:  PubMed          Journal:  PLoS Negl Trop Dis        ISSN: 1935-2727


Introduction

Hard ticks (Ixodidae) are hematophagous ectoparasites that feed on almost every species of terrestrial vertebrate on earth, including Homo sapiens sapiens [1, 2]. Due to a complete dependency on blood as a food source, both sexes of adults and immature ticks are capable of transmitting disease pathogens to their hosts, causing significant morbidity and sometimes even death [3, 4]. Research on hard ticks has increased recently in the Neotropics, where a growing number of outbreaks of tick-borne related illnesses have been documented [5-8]. Despite these efforts, comprehensive studies about the ecology, behavior and control of hard ticks relevant to public health remain elusive in Central America due to the shortcomings of traditional taxonomic methods for species identification [9]. Taxonomic identification of Neotropical Ixodidae has traditionally relied on adult morphological characters [10]; however, morphological keys for immature stages (i.e., larvae and nymphs) are lacking and experts are often unable to reliably identify immature ticks to species [10, 11]. Moreover, morphological identification of ticks is unrealistic in epidemiological settings because assessing the role of ticks as disease vectors usually involves identifying hundreds of individuals for pathogen screening, an extremely time-consuming effort, which may be further impeded by the lack of qualified taxonomic specialists [12]. Matrix-assisted laser desorption/ionization (MALDI) time-of-flight mass spectrometry is an analytical technique that allows for sensitive and accurate detection of complex molecules such as proteins, peptides, lipids and nucleic acids [13-15]. The conventional MALDI approach has been used successfully for proteomic fingerprinting through pattern recognition for the identification of microorganisms such as pathogenic bacteria and fungi, which can be cultured in the laboratory and form discrete colonies with very consistent mass spectra that facilitates the development of reference libraries for identification of unknown samples [16, 17]. In fact, a commercial program offered by the manufacturers of the MALDI technology is capable of determining statistical similarities between the spectra of unknown samples and a well-curated, proprietary reference library of bacteria and fungi to identify the species of the unknown specimen. This is analogous to the process of matching fingerprints, and offers a simplified comparison score that ranges from 0.0 to 3.0. Scores above or equal to 2.3 represent a confident match at the genus rank, and high probability at the species level, while values below 1.7 are considered as non-reliable identifications [16-18]. Although more challenging than identifying bacteria and fungi due to the size and heterogeneity of the specimen, MALDI has also been used to discriminate among species of invertebrates, including mosquitoes (Culicidae—Anopheles), fleas (Pulicidae—Ctenocephalide), biting midges (Ceratopogonidae–Culicoides), sandflies (Psychodidae–Phlebotomus, Lutzomyia) and ticks (Ixodidae–Rhipicephalus) [19-27]. A key finding from these studies is that protein spectra obtained from body sections or whole specimens were similar among individuals of the same morphological species but differed noticeably across different species. Therefore, MALDI protein spectra can be used as a tool to delimit species boundaries in arthropods that are vectors of pathogens. Nevertheless, fresh laboratory-reared specimens are routinely needed to build a reference library that meets the high-quality standards required for classification. This represents an important limitation for some arthropod groups, or assemblages, that are difficult to rear under laboratory conditions. In addition, epidemiological studies often rely on field-collected specimens preserved in ethanol for long-term storage in reference collections. To overcome these limitations, previous studies have opted for adjusting the comparison scores minimum-threshold limit for identification, lowering the manufacturer´s recommended scores from 2.3 to 1.8 [22, 28] or even 1.3 [23, 29]. Hence, mass fingerprinting for the identification of field-collected specimens that do not exist in a reference spectra library (or for those from which reference spectra cannot be generated under ideal conditions) requires an alternative, objective approach [12]. Moreover, most existing applications of MALDI to identify arthropod disease vectors have focused on relatively species-poor vector assemblages from Europe. This technique has been tested less frequently in the new world tropics [20, 21, 23, 25, 28–37], where vector species richness is the greatest on Earth. Here, we used MALDI as a scheme to identify Neotropical specimens of adult hard ticks derived from ethanol-preserved field collections. Specifically, we used machine learning and pattern recognition algorithms to classify protein spectra from the legs of field-collected specimens in order to identify a group of unknown samples with a self-curated reference library. MALDI is a promising tool for cataloging and quickly identifying large arthropod groups such as ticks [12]. Our results should contribute to the growing body of literature trying to address questions about feasibility, reliability and universality of the methodology for different environments and species that have not been evaluated before. Properly identifying disease vectors such as Ixodidae in highly diverse Neotropical countries, such as Panama, is a critical first step to predict and manage tick-borne zoonotic pathogens such as Rickettsia and arboviruses (i.e., arthropod-borne viruses).

Methods

Sample preparation

Ticks stored in ethanol for up to 5 years, and previously identified based on morphological characters, were taken from long-term storage in a -20°C freezer (S1 Table). A total of 103 specimens from the following species were included in this study: Amblyomma mixtum (cajennense), Amblyomma calcaratum, Amblyomma dissimile, Amblyomma geayi, Amblyomma nodosum, Amblyomma oblongoguttatum, Amblyomma ovale, Amblyomma pecarium, Amblyomma sabanerae, Amblyomma varium, Amblyomma naponense, Amblyomma tapirellum, Ixodes affinis, Ixodes boliviensis, Dermacentor nitens, Haemaphysalis juxtackochi, Rhipicephalus microplus and Rhipicephalus sanguineus. Samples were prepared following previously published protocols with minor modifications [22, 23]. Briefly, we removed either the left or the right anterior leg from each tick specimen using a scalpel. The leg was then put in a tube with 300 μL ultrapure water followed by the addition of 900 μL 100% ethanol. The tube was vortexed for 15 seconds and centrifuged using a Heraeus Biofuge Pico microcentrifuge (Thermo Fisher Scientific, Waltham, MA, USA) at 17,000 g for 2 minutes. After centrifugation, the supernatant was poured off from the sample tube, which was left to dry for 15 minutes. Subsequently, the leg was resuspended in 60 μL 70% formic acid and 60 μL 100% acetonitrile and homogenized in the microtube using a manual pestle. The sample was placed in a Branson 1510 ultra-sonicator (Bransonic, Danbury, CT, USA) for 60 minutes in ice water, and then vortexed for 15 seconds and centrifuged again at 17,000 g for 2 minutes. For peptide detection with mass spectrometry, a saturated solution (10 mg/mL) of α-cyano-4-hydroxycinnamic acid (HCCA) matrix was prepared in 30:70 [v/v] acetonitrile: 0.1% trifluoroacetic acid (TFA) in water. An aliquot of 1 μL from the sample supernatant was pre-mixed with an equal volume of HCCA matrix, and 1 μL of the mix was quickly pipetted onto a polished steel MALDI plate in its respective target spot. All samples were placed and measured on three individual target spots with spectra from three technical replicates collected per spot. After letting the plate dry, it was inserted into the MALDI mass spectrometer to record the protein spectra from the tick´s leg.

MALDI mass spectrometry parameters

We used an UltrafleXtreme spectrometer (Bruker Daltonics, Bremen, Germany) to generate the protein mass spectra of each specimen. The equipment has a MALDI source, a time-of-flight (TOF) mass analyzer, and a 2 kHz Smartbeam-II neodymium-doped yttrium aluminum garnet (Nd:YAG) solid-state laser (λ = 355 nm) that we used in positive polarization mode. All spectra were automatically acquired in the range of 2,000 to 20,000 m/z in linear mode for the detection of the most abundant protein ions. Each spectrum represented the accumulation of 5,100 shots with 300 shots taken at a time, and the acquisition was done in random-walk mode with a laser power in the range of 50% to 100% (global laser attenuation at 30%). The software FlexAnalysis (Bruker) was used to pre-process and evaluate the mass spectra quality, based on the number of ion peaks and their intensity. Initially, all sample spectra were normalized by applying a general algorithm for baseline subtraction and smoothing provided by the software. Visual comparisons of the mass spectra from different tick species gave initial indications of dominant ion peaks that would suggest possible classification into discrete groups. Mass spectra that did not include at least one ion peak with an intensity of 1000 a.u. or more, were considered low quality and filtered out. All samples were placed and measured on three individual target spots, with three technical replicates of the mass spectra collected per spot.

Data analysis, clustering algorithms and statistics

The methodology has been described in detail previously by our group for the identification of adult mosquito legs [27], based on similar data analysis for face recognition [38, 39] and spectral classification using mass spectrometry [40, 41]. In brief, 239 mass spectra generated across 103 samples for all 18 species of morphologically identified Neotropical hard ticks were classified with a custom-made algorithm developed by our group using MATLAB (MathWorks, Natick, MA, USA). The algorithm is based on Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), which are linear transformation techniques from the field of Machine Learning that are commonly used for dimensionality reduction and classification. Dimensionality reduction can help decrease computational costs for classification, as well as avoid overfitting by minimizing the error in parameter estimation. Overfitting was also addressed by maximizing the number of specimens analyzed per species, while minimizing the number of technical replicas (i.e., only three spectra per specimen with good signal intensity were used for data analysis). PCA is an “unsupervised” algorithm that generates vectors that correspond to the direction of maximal variance in the sample space. On the other hand, LDA is a “supervised” algorithm that considers class information to provide a basis that best discriminates the classes (i.e., tick species) [38]. For both PCA and LDA analyses, we calculated the Euclidean distance between the vector describing the test sample and the average vector describing each class to identify a test sample. The class with the minimum distance with respect to the test sample was assigned as the identified species for that test sample. The LDA was applied over the data set expressed in terms of the coefficients (i.e., principal components) obtained by the PCA. Thus, PCA reduced the dimensionality of the data, and the LDA provided the supervised classification. The performance of the clustering algorithms was tested using Monte Carlo simulations over 1000 iterations per species to optimize training and cross-validation prediction success rates. For each iteration, the data elements in each class were split randomly in approximately, but not less than, 20% of the elements for testing and the rest of the elements for training, for each species. We used all the peaks in the spectra for the PCA analysis, and the first 150 principal components from the PCA stage that explained 99.9% of the total variance were then projected for the LDA algorithm, which also generated a 150-components data set. The number of components was chosen after a performance analysis, again using a Monte Carlo approach, that provided the best identification rates. Global and class positive identification rates were calculated to establish the classification capacity of the algorithm. The positive identification rate corresponds to the percent ratio between positive identifications performed by the algorithm and the real positive cases in the data. For visualization purposes in the plots generated with our algorithm in MATLAB, species that were morphologically identified within the Rhipicephalus and Ixodes genera were separately compared against Dermacentor and Haemaphysalis for which there was only one species in each. All species that were morphologically identified within the Amblyomma genus were separately compared between themselves or against the Ixodes genera.

Results

Optical micrographs from 18 species of Neotropical hard ticks showed evident differences among species in terms of adult morphological features (Fig 1), which was well aligned with the expected unique mass spectra generated from each sample and taxon (Fig 2, S1 Fig, S2 Fig and S3 Fig). The global automatic acquisition rate was 77% for all species (Table 1), confirming that, overall, the mass spectra of field-collected and ethanol-preserved specimens allowed automatic acquisition of spectra. In fact, automatic acquisition of spectra results in faster and more objective data acquisition than performing spectra collection manually. However, the automatic spectra collection, coupled to the fact that species had different starting number of specimens, meant that the number of spectra per species for data analysis was not the same and, in some cases, did not meet the expected number of spectra per specimen (Table 1). Still, this was not an obstacle for our data analysis clustering algorithm. The percentage of automatic spectra acquisition with the MALDI ranged from 50% for A. mixtum (cajennense), I. boliviensis and R. sanguineus to 100% for several of the species, including A. calcaratum, A. geayi, A. sabanerae, I. affinis, and R. microplus, covering a range from 6 to 56 spectra per species (Table 1). The time stored in ethanol or the location of sample origin did not seem to explain the variable percentages of automatic spectra collection (S1 Table). Spectra from freshly collected specimens stored dry at -20°C, used to establish the methodology, exhibited the best signals, with better-defined spectral peaks and higher signal-to-noise ratio.
Fig 1

Optical micrographs of Neotropical hard ticks.

The image shows the dorsal and ventral sides for all 18 species of hard ticks in the genera Amblyomma, Dermacentor, Haemaphysalis, Ixodes, and Rhipicephalus used to generate protein spectra with our MALDI mass spectrometry approach.

Fig 2

Baseline-corrected and smoothed spectra for 18 species of ticks in the genus Amblyomma, Dermacentor, Haemaphysalis, Ixodes and Rhipicephalus.

Major ion peaks and their molecular weights are annotated in the range of 2,000 to 20,000 m/z for all species.

Table 1

Description of specimens subjected to analysis with the MALDI mass spectrometry procedure.

Species Name# of specimensLocality code# of expected spectra# of obtained spectraMALDI automatic spectra acquisition rate (%)
Amblyomma mixtum (cajennense)4a12650%
Amblyomma calcaratum5a, b1515100%
Amblyomma dissimile4c12975%
Amblyomma geayi4d1212100%
Amblyomma nodosum4a121083%
Amblyomma oblongoguttatum4a, e12867%
Amblyomma ovale4e121192%
Amblyomma pecarium4e121192%
Amblyomma sabanerae3f99100%
Amblyomma varium4g12975%
Amblyomma naponense5f15960%
Amblyomma tapirellum *26e, g785672%
Ixodes affinis4e1212100%
Ixodes boliviensis4e12650%
Dermacentor nitens4c12975%
Haemaphysalis juxtackochi6a, e181161%
Rhipicephalus microplus10c, d3030100%
Rhipicephalus sanguineus4a12650%
Total103a-g30923977%

(a) = Panama: West Panama, Las Pavas; (b) = Panama: Colon, Madden Road; (c) = Panama: Colon, Achiote; (d) = Panama: West Panama, Capira; (e) Panama: Colon, Barro Colorado Island; (f) Panama: Colon, Sierra Llorona Lodge; (g) Panama: Colon, Gamboa. (*) Indicates some specific specimens that upon collection were stored fresh in Silica Gel (For more metadata information about these samples see also S1 Table).

Optical micrographs of Neotropical hard ticks.

The image shows the dorsal and ventral sides for all 18 species of hard ticks in the genera Amblyomma, Dermacentor, Haemaphysalis, Ixodes, and Rhipicephalus used to generate protein spectra with our MALDI mass spectrometry approach.

Baseline-corrected and smoothed spectra for 18 species of ticks in the genus Amblyomma, Dermacentor, Haemaphysalis, Ixodes and Rhipicephalus.

Major ion peaks and their molecular weights are annotated in the range of 2,000 to 20,000 m/z for all species. (a) = Panama: West Panama, Las Pavas; (b) = Panama: Colon, Madden Road; (c) = Panama: Colon, Achiote; (d) = Panama: West Panama, Capira; (e) Panama: Colon, Barro Colorado Island; (f) Panama: Colon, Sierra Llorona Lodge; (g) Panama: Colon, Gamboa. (*) Indicates some specific specimens that upon collection were stored fresh in Silica Gel (For more metadata information about these samples see also S1 Table). In addition, the specimens within each species showed consistently similar protein profiles, regardless of their taxonomic genera, sex, collection date and/or sampling location (S1 Fig, S2 Fig, S3 Fig). Mean protein spectra for tick species differed visually among taxa and the differences appeared to be related to their degree of phylogenetic relatedness (Fig 2). For example, species within the genera Ixodes, Rhipicephalus, and Amblyomma were more similar among themselves in terms of the ions peak number and mass over charge (m/z) position in their mass spectra than species from different genera. Nonetheless, some closely related species within the Amblyomma genus such as A. mixtum (cajennense), A. varium, and A. tapirellum also showed fairly distinct protein spectra (Fig 2), which motivated the application of clustering algorithms for their classification. Distinct mass spectra profiles between morphologically identified ixodid species could be classified by an unsupervised PCA algorithm to identify specimens. The quantitative performance of the PCA algorithm was assessed per species (Table 2), and visually confirmed with the graphic clustering presented in 3D plots (Fig 3). The PCA global positive identification rate was 91.2%, with 14 out of 18 species having higher than 90% positive identification rate. The PCA graphs showed that most species separated in well-defined clusters, and the distance among clusters seemed to be related to the degree of phylogenetic relatedness as evidenced by the clear separation from the specimens of Dermacentor and Rhipicephalus with those from Haemaphysalis and Ixodes (Fig 3A and 3B), or just between the specimens of Amblyomma (Fig 3C). When comparing species within the genus Amblyomma against those from Ixodes, again the spectra from specimens of each species clustered together with limited overlap between groups and those from different genera were clearly separated (Fig 3D).
Table 2

Performance of PCA and LDA clustering algorithms.

Species NamePCA Positive Identification Rate (%)LDA Positive Identification Rate (%)Spectra per Class# Training Elements# Test Elements
Amblyomma mixtum (cajennense)100.0%100.0%640002000
Amblyomma calcaratum100.0%99.6%15120003000
Amblyomma dissimile67.6%67.6%970002000
Amblyomma geayi99.1%99.6%1290003000
Amblyomma nodosum100.0%100.0%1080002000
Amblyomma oblongoguttatum100.0%100.0%860002000
Amblyomma ovale100.0%100.0%1180003000
Amblyomma pecarium99.8%99.0%1180003000
Amblyomma sabanerae69.3%85.9%970002000
Amblyomma varium99.8%100.0%970002000
Amblyomma naponense100.0%100.0%970002000
Amblyomma tapirellum97.8%97.8%564400012000
Dermacentor nitens21.7%45.6%1290003000
Haemaphysalis juxtackochi90.9%97.8%640002000
Ixodes affinis84.0%89.5%970002000
Ixodes boliviensis96.8%98.8%1180003000
Rhipicephalus microplus93.1%98.7%30240006000
Rhipicephalus sanguineus100.0%100.0%640002000
Global91.2%94.2%23918300056000
Fig 3

Principal component analysis (PCA) of individual species plotted against first, second and third principal components (PC).

All species were classified using a Monte Carlo simulation with 1000 iterations, in which 80% of the samples were used as training set (⎕) and the remaining 20% as test set (• for positive identifications and + for negative ones). The cluster centroid of each species is also presented in the graph (⋄). The plots show (A) the training and test sets for the species belonging to the Dermacentor, Haemaphysalis, Ixodes and Rhipicephalus genera, and (B) only the test sets for better visualization; as well as the training set and test set of (C) Amblyomma species alone or (D) Amblyomma in combination with Ixodes genera. The unsupervised PCA algorithm had a global positive identification rate of 91.2%. These 3D plots represent only one of the 1000 Monte Carlo iterations performed with the algorithm.

Principal component analysis (PCA) of individual species plotted against first, second and third principal components (PC).

All species were classified using a Monte Carlo simulation with 1000 iterations, in which 80% of the samples were used as training set (⎕) and the remaining 20% as test set (• for positive identifications and + for negative ones). The cluster centroid of each species is also presented in the graph (⋄). The plots show (A) the training and test sets for the species belonging to the Dermacentor, Haemaphysalis, Ixodes and Rhipicephalus genera, and (B) only the test sets for better visualization; as well as the training set and test set of (C) Amblyomma species alone or (D) Amblyomma in combination with Ixodes genera. The unsupervised PCA algorithm had a global positive identification rate of 91.2%. These 3D plots represent only one of the 1000 Monte Carlo iterations performed with the algorithm. In addition, the LDA clustering analysis showed a global positive identification rate of 94.2% (Fig 4; Table 2), with 14 out of 18 species having higher than 97.8% positive identification rate. The range of positive identification rates went from 100% (best score possible) for A. mixtum (cajennense), A. nodosum, A. oblongoguttatum, A. ovale, A. varium, A. naponense and R. sanguineus to 45.6% for D. nitens. The 3D representation plots of the LDA clustering displayed that the separation between species was more pronounced than with PCA when comparing species from different genera, confirming the improved quantitative results of the performance of the LDA algorithm (Table 2).
Fig 4

Linear Discriminant Analysis (LDA) applied to spectra from tick species of the genera Amblyomma, Dermacentor, Haemaphysalis, Ixodes and Rhipicephalus.

The plots show (A) the training and test sets for species in the Dermacentor, Haemaphysalis, Ixodes and Rhipicephalus genera projected over the first three components of the LDA, as well as (B) only the test set for better visualization; and also the training and test sets for (C) the Amblyomma genus alone, as well as (D) the Amblyomma genus compared to the Ixodes genus. These 3D plots represent only one of the 1000 Monte Carlo iterations performed with the algorithm. The supervised LDA algorithm had a 94.2% global positive identification rate.

Linear Discriminant Analysis (LDA) applied to spectra from tick species of the genera Amblyomma, Dermacentor, Haemaphysalis, Ixodes and Rhipicephalus.

The plots show (A) the training and test sets for species in the Dermacentor, Haemaphysalis, Ixodes and Rhipicephalus genera projected over the first three components of the LDA, as well as (B) only the test set for better visualization; and also the training and test sets for (C) the Amblyomma genus alone, as well as (D) the Amblyomma genus compared to the Ixodes genus. These 3D plots represent only one of the 1000 Monte Carlo iterations performed with the algorithm. The supervised LDA algorithm had a 94.2% global positive identification rate.

Discussion

Our results show that MALDI mass spectra of highly abundant proteins in arthropod legs served as fingerprints to identify samples of 18 species of Neotropical hard ticks using machine learning and pattern recognition algorithms to create a self-curated reference library. We compared smoothed and baseline-corrected spectra generated from unknown field-collected tick samples against the mean spectra from a subset of the same field samples that had already been identified through traditional means. To systematize this process, we used PCA and LDA algorithms to classify mass spectra without prior establishment of a high-quality reference library, which typically requires laboratory-reared specimens that may not be possible to obtain for all species. Global positive identification rates of up to 94.2% were achieved with this methodology, offering a rapid, reliable and objective approach to identify hard tick species, which will likely improve as more specimens are evaluated and included in our database. These outcomes agree with our previous work [27] in which we used a similar approach to classify field-collected samples of 11 morphologically-identified species of Anopheles mosquitoes. In that study, Neotropical Anopheles samples were stored dry in silica gel at—20°C, which seemed to avoid sample degradation and maintain spectral quality. This contrasts with the present study, where most of our specimens were stored in ethanol at -20°C for several years. Thus, our findings confirm that our novel analytical approach using MALDI and PCA/LDA clustering algorithms is robust for species classification regardless of the arthropod assemblage, sample storing conditions, and the lack of a high-quality reference library. In fact, the percentage of automatic spectra acquisition from the processed tick species was much higher (Table 1) than that obtained in our previous publication using mosquitoes, which ranged from 41.8% in Anopheles albimanus to 70.3% in Anopheles triannulatus [27]. Our results herein also show that both classification algorithms, PCA and LDA, were capable of clustering and recognizing spectra from up to 18 different tick species, including roughly 50% of ixodid taxa (e.g., both ecologically dominant and rare species) reported for Panama [27, 42]. LDA outcomes were more discriminant and robust than PCA overall, but PCA also classified species from different genera with over 91% accuracy and consistency. LDA was able to cluster each of the 18 species of ticks with validation and cross-validation scores above 94%, both between and within genera. As expected, the clustering algorithm was most accurate for distinctly related phylogenetic species (i.e., Ixodes, Rhipicephalus and Haemaphysalis genera), with higher than 97% success rate in most of these cases, than for closely related species (i.e., Amblyomma genus). However, A. dissimile and D. nitens depicted only moderate to low positive identification rates. Although this could be due to assemblage specific signals (i.e., high protein variability of conspecifics within these taxa), sample degradation and contamination, or technical errors such as spotting errors cannot be ruled out entirely. Future studies will have to corroborate the findings regarding these two species. Although the number of samples analyzed for some ixodid species was relatively low, several of these taxa are considered cryptic species complexes [43] and have been implicated as vectors of human pathogens in Panama as well as more broadly, including A. mixtum (cajennense) and D. nitens, the likely vectors of Rickettsia rickettsii, known to cause Rocky Mountain spotted fever [44]. We also included samples of A. tapirellum, A. oblongoguttatum and H. juxtakochi, three species from which human pathogens have been previously isolated [45], such as: Coxiella-related bacteria, whose member C. burnetii can cause Q fever; Ehrlichia, which causes ehrlichiosis infection; and Rickettsia, which causes a variety of bacterial infections in humans and other animals. These results are important because our species identification platform can serve along with recently implemented metagenomic approaches as additional tools for health ministries in Panama and other countries, to monitor, predict and manage tick-borne zoonotic pathogens [46]. Morphological taxonomic identification of ixodid ticks can be enhanced by molecular techniques such as DNA barcoding [8, 47], but this procedure is laborious, expensive and needs a highly trained lab technician. Studies show that typical DNA barcoding costs can range from $2 to $5 per sample, with difficult-to-extract samples increasing the cost two-fold or more [48, 49]; while costs associated to MALDI species identification have been calculated to be less than $0.50 per sample, without considering the high equipment cost [50-52]. Furthermore, a comprehensive repository of DNA sequences (e.g., DNA barcodes) is needed in order to test species limits, yet only a handful of Neotropical tick species are represented in Genbank [53] or BOLD [54] repositories, which could limit identification to the most common taxa only. In addition, DNA barcoding occasionally fails to delimit species boundaries due to ambiguous evolutionary relationships among closely related tick species [47]. Modern methodologies of whole genome analysis of arthropod vectors using Illumina or Nanopore next generation sequencing platforms can be applied not only to delimit taxonomic boundaries among tick species, but also to examine vector evolution (i.e., positive selection and ecological diversification), demographic phenomena (i.e., expansion and bottlenecks) and molecular epidemiology (i.e., pathogen infection and genetic diversity). The cost of these modern technologies is decreasing rapidly, and they could quickly become a valid alternative for taxonomic studies in developing and middle-income countries of Central America, including Panama. Indeed, portable Nanopore MinION methodology can be performed at the site of interest, with a laptop computer by someone with very basic entomological knowledge, and at a very affordable price on a per-sample basis [55]. Nevertheless, the bioinformatic skills and cluster capacity to process whole genome sequences of tick samples might represent an impractical burden for some institutions in developing nations, which may not have the machinery or competency to analyze this kind of data. Moreover, using the Nanopore MinION next generation sequencing approach for the exclusive goal of achieving reliable taxonomic identification of tick species may represent an underutilized expenditure that might ultimately end up overkilling the budget of resource-limited institutions. While MALDI mass spectrometry suffers from many of the shortcomings listed for other technologies, our approach can be used to identify both field-collected vectors and the pathogens they harbor in a short period of time, with a minimal amount of tissue and without the need of expert taxonomists. Our strategy to analyze protein spectra also overcomes the drawbacks of working without a reference library to classify unknown samples. We posit that MALDI mass spectra of highly abundant proteins from arthropod tissues is a powerful tool for species identification that can be easily adapted to other biological systems. However, we also believe that this technology will be best used as a complement to the traditional barcoding technique or modern next generation sequencing methodologies, to accurately confirm species boundaries across entire arthropod communities, while considering problematic vector taxonomy and the availability of local financial resources. Developing an additional tool for rapid and accurate arthropod species identification offers further flexibility to the fluctuating budgets of the research community in Central/South America. The long-term goal of our analytical approach with MALDI is to develop a tool that can enhance currently available open-source, web-based platforms, such as MALDI UP [56], MicrobeMS [57], or Mass-Up [58]; or become a new all-in-one platform where users can upload mass spectra datasets of known specimens to increase the number of species covered (e.g., bacteria, fungi, insects) and directly test spectra from unknown specimens for identification with our clustering algorithms. This crowd-sourced approach could be more cost effective, given that it is not necessary to generate a reference library of well-curated samples. Instead, field samples can be taxonomically assigned as they arrive to the laboratory using a correctly matched protein fingerprint, while unidentified samples can be identified with traditional methods and added as new entries into the growing self-curated reference database.

Conclusions

The present study used MALDI mass spectrometry as a tool to rapidly identify Neotropical specimens of adult hard ticks that had been preserved in ethanol for several years. Our algorithms were capable of identifying specimens from the 18 tick species evaluated, based on their protein spectra “fingerprint” with up to 94% cross-validation capability. This is the first report of the protein mass spectra from the leg for most of these Neotropical tick species. Large arthropod groups such as ticks are difficult to identify with currently available strategies from commercial vendors, forcing the user to lower the “quality” bar of a positive match to enhance the percentage of correct identification. Our MALDI/self-curated library approach, although still under development and serving as an auxiliary technique to traditional identification methods (and not necessarily replacing them), would reduce considerably the number of samples that would require morphological identification or DNA barcoding. This will reduce the time and cost needed to integrate these techniques in routine surveillance programs in Neotropical regions where tick diversity remains relatively uncharacterized.

Baseline-corrected and smoothed spectra for tick specimens from the species A. calcaratum.

Major ion peaks and their molecular weights are annotated in the range of 2,000 to 20,000 m/z for all specimens. The dataset shows consistently similar protein profiles, regardless of their sex, collection date and/or sampling location. (TIF) Click here for additional data file.

Baseline-corrected and smoothed spectra for tick specimens from the species R. microplus.

Major ion peaks and their molecular weights are annotated in the range of 2,000 to 20,000 m/z for all specimens. The dataset shows consistently similar protein profiles, regardless of their sex, collection date and/or sampling location. (TIF) Click here for additional data file.

Baseline-corrected and smoothed spectra for tick specimens from the species A. tapirellum.

Major ion peaks and their molecular weights are annotated in the range of 2,000 to 20,000 m/z for all specimens. The dataset shows consistently similar protein profiles, regardless of their sex, collection date and/or sampling location. (TIF) Click here for additional data file.

Metadata of specimens and species of hard tick (e.g., Ixodidae) collected in Panama.

(XLSX) Click here for additional data file.
  54 in total

1.  Performance and cost analysis of matrix-assisted laser desorption ionization-time of flight mass spectrometry for routine identification of yeast.

Authors:  Neelam Dhiman; Leslie Hall; Sherri L Wohlfiel; Seanne P Buckwalter; Nancy L Wengenack
Journal:  J Clin Microbiol       Date:  2011-01-26       Impact factor: 5.948

2.  Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry: Mechanistic Studies and Methods for Improving the Structural Identification of Carbohydrates.

Authors:  Yin-Hung Lai; Yi-Sheng Wang
Journal:  Mass Spectrom (Tokyo)       Date:  2017-09-22

3.  Comparison of bacterial identification by MALDI-TOF mass spectrometry and conventional diagnostic microbiology methods: agreement, speed and cost implications.

Authors:  K El-Bouri; S Johnston; E Rees; I Thomas; N Bome-Mannathoko; C Jones; M Reid; B Ben-Ismaeil; A R Davies; L G Harris; D Mack
Journal:  Br J Biomed Sci       Date:  2012       Impact factor: 3.829

4.  A MinION™-based pipeline for fast and cost-effective DNA barcoding.

Authors:  Amrita Srivathsan; Bilgenur Baloğlu; Wendy Wang; Wei X Tan; Denis Bertrand; Amanda H Q Ng; Esther J H Boey; Jayce J Y Koh; Niranjan Nagarajan; Rudolf Meier
Journal:  Mol Ecol Resour       Date:  2018-04-19       Impact factor: 7.090

5.  Matrix-assisted laser desorption ionization-time of flight mass spectrometry for rapid identification of tick vectors.

Authors:  Amina Yssouf; Christophe Flaudrops; Rezak Drali; Tahar Kernif; Cristina Socolovschi; Jean-Michel Berenger; Didier Raoult; Philippe Parola
Journal:  J Clin Microbiol       Date:  2012-12-05       Impact factor: 5.948

6.  Evaluation of matrix-assisted laser desorption/ionization time of flight mass spectrometry for the identification of ceratopogonid and culicid larvae.

Authors:  I C Steinmann; V Pflüger; F Schaffner; A Mathis; C Kaufmann
Journal:  Parasitology       Date:  2012-10-19       Impact factor: 3.234

7.  Is DNA barcoding actually cheaper and faster than traditional morphological methods: results from a survey of freshwater bioassessment efforts in the United States?

Authors:  Eric D Stein; Maria C Martinez; Sara Stiles; Peter E Miller; Evgeny V Zakharov
Journal:  PLoS One       Date:  2014-04-22       Impact factor: 3.240

8.  Accurate identification of Culicidae at aquatic developmental stages by MALDI-TOF MS profiling.

Authors:  Constentin Dieme; Amina Yssouf; Anubis Vega-Rúa; Jean-Michel Berenger; Anna-Bella Failloux; Didier Raoult; Philippe Parola; Lionel Almeras
Journal:  Parasit Vectors       Date:  2014-12-02       Impact factor: 3.876

9.  Molecular and MALDI-TOF identification of ticks and tick-associated bacteria in Mali.

Authors:  Adama Zan Diarra; Lionel Almeras; Maureen Laroche; Jean-Michel Berenger; Abdoulaye K Koné; Zakaria Bocoum; Abdoulaye Dabo; Ogobara Doumbo; Didier Raoult; Philippe Parola
Journal:  PLoS Negl Trop Dis       Date:  2017-07-24

10.  Recent Advances and Ongoing Challenges in the Diagnosis of Microbial Infections by MALDI-TOF Mass Spectrometry.

Authors:  Walter Florio; Arianna Tavanti; Simona Barnini; Emilia Ghelardi; Antonella Lupetti
Journal:  Front Microbiol       Date:  2018-05-29       Impact factor: 5.640

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.