Valério G Barauna1, Maneesh N Singh2, Leonardo Leal Barbosa1, Wena Dantas Marcarini1, Paula Frizera Vassallo1,3, Jose Geraldo Mill1, Rodrigo Ribeiro-Rodrigues4, Luciene C G Campos5, Patrick H Warnke6,7, Francis L Martin2. 1. Department of Physiological Sciences, Federal University of Espírito Santo, 29075-910 Vitoria, Brazil. 2. Biocel UK Ltd., 15 Riplingham Road, West Ella, Hull HU10 6TS, U.K. 3. Clinical Hospital, Federal University of Minas Gerais, 31270-901 Belo Horizonte, Brazil. 4. Núcleo de Doenças Infecciosas, Federal University of Espírito Santo, 29075-910 Vitoria, Brazil. 5. Department of Biological Science, Santa Cruz State University, 45662-900 Bahia, Brazil. 6. Praxisklinik am Ballastkai, Ballastkai 5, 24937 Flensburg, Germany. 7. Department of OMF-Surgery, Christian-Albrechts-University of Kiel, 24118 Kiel, Germany.
Abstract
There is an urgent need for ultrarapid testing regimens to detect the severe acute respiratory syndrome coronavirus 2 [SARS-CoV-2] infections in real-time within seconds to stop its spread. Current testing approaches for this RNA virus focus primarily on diagnosis by RT-qPCR, which is time-consuming, costly, often inaccurate, and impractical for general population rollout due to the need for laboratory processing. The latency until the test result arrives with the patient has led to further virus spread. Furthermore, latest antigen rapid tests still require 15-30 min processing time and are challenging to handle. Despite increased polymerase chain reaction (PCR)-test and antigen-test efforts, the pandemic continues to evolve worldwide. Herein, we developed a superfast, reagent-free, and nondestructive approach of attenuated total reflection Fourier-transform infrared (ATR-FTIR) spectroscopy with subsequent chemometric analysis toward the prescreening of virus-infected samples. Contrived saliva samples spiked with inactivated γ-irradiated COVID-19 virus particles at levels down to 1582 copies/mL generated infrared (IR) spectra with a good signal-to-noise ratio. Predominant virus spectral peaks are tentatively associated with nucleic acid bands, including RNA. At low copy numbers, the presence of a virus particle was found to be capable of modifying the IR spectral signature of saliva, again with discriminating wavenumbers primarily associated with RNA. Discrimination was also achievable following ATR-FTIR spectral analysis of swabs immersed in saliva variously spiked with virus. Next, we nested our test system in a clinical setting wherein participants were recruited to provide demographic details, symptoms, parallel RT-qPCR testing, and the acquisition of pharyngeal swabs for ATR-FTIR spectral analysis. Initial categorization of swab samples into negative versus positive COVID-19 infection was based on symptoms and PCR results (n = 111 negatives and 70 positives). 
Following training and validation (using n = 61 negatives and 20 positives) of a genetic algorithm-linear discriminant analysis (GA-LDA) algorithm, a blind sensitivity of 95% and specificity of 89% was achieved. This prompt approach generates results within 2 min and is applicable in high-traffic settings that require immediate test results, such as airports, events, or gate controls.
In early 2020, a new strain of coronavirus called severe acute
respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes
COVID-19, gave rise to a global pandemic.[1] Starting from an epidemic outbreak in Wuhan (China),
the virus quickly spread westwards toward Europe and the USA[2] with serious health and socioeconomic consequences
worldwide.[3] SARS-CoV-2 exhibits a high
propensity for infectious spread throughout populations.[2] Every COVID-19 positive case, if not contained,
can readily spread to two or more people, giving a high R number.[4] Some countries, such as South Korea, initially
fought the COVID-19 outbreak successfully. This success was based on the key
aspects[5] of: (a) prevention, via good cleaning
practices and isolation of potential cases; (b) testing, to identify
those infected and to precisely isolate risk cases; and (c) antiviral
treatment and, in the future, a vaccine. Testing is fundamental to
identify infected people and regions of risk.[6] This can enable intelligent isolation of areas without affecting
an entire country’s economy and allow allocation of resources
to more strategically fight the disease, with more ventilators, medication,
and medical staff assigned to regions with more diagnosed cases.

The main challenges for testing are the cost and, in particular, the
time required for each test result. Gold-standard diagnosis by RT-qPCR
is costly with a shortage of testing facilities even in developed
countries and can take >2 days to get the result because specimens
have to be transported for processing to often distant laboratories.[7] This is not suitable for mass testing.[8] Despite globally increased polymerase chain reaction
(PCR)-test efforts, the pandemic was not brought to a halt. On the contrary,
there has been a recurrence and second wave of the disease because many
infectious patients spread the disease while waiting on their PCR-test
results. Some companies are developing quicker and lower-cost
tests based on novel sensors.[9] Alternative
antigen- or antibody-detection approaches remain unproven, possibly
creating a statistical bias that could directly affect public health
policies.[10] Whilst initial reports claimed
a sensitivity of 100%, more recent studies report levels as low as
30%;[11] others report 72 or 81% in different
settings.[12] At this stage, experimental
claims of sensitivity and specificity for antigen tests remain to
be robustly proven.[13] Thus, there is still
a need to develop COVID-19 test approaches that can deliver results
in real-time and on-site.

Vibrational spectroscopy, including
attenuated total reflection
Fourier-transform infrared (ATR-FTIR) spectroscopy, has been widely
used to discriminate and classify normal and pathological populations
using different cell types, tissues, or biofluids.[14−16] Readily accessible
biofluids, such as blood plasma/serum, saliva, or urine, are considered
ideal for clinical implementation due to routine methods of collection,
as well as minimal sample preparation.[17] Interrogation of samples with infrared (IR) spectroscopic techniques
allows for the generation of a “spectral fingerprint”,
which subsequently facilitates the discrimination of the different
populations and identification of potential biomarkers.[18] In the past few years, biofluid-based ATR-FTIR
spectroscopy has been used for diagnosing, screening, or monitoring
the progression/regression in a variety of diseases.[19] Spectroscopic techniques are rapid, cost-effective, and
nondestructive, which makes them strong candidates for translation
to the clinic, even as an adjunct to more established methods.

As a readily accessible noninvasive biofluid, saliva is an ideal
candidate to facilitate disease detection; indeed, oral health has
long been known to be an indicator of whole organism health.[20] Herein, ATR-FTIR spectroscopy was used to interrogate
saliva samples on pharyngeal swabs taken from individuals with or
without suspected infection with COVID-19. Unlike many tests developed
using laboratory-based contrived specimens, we trialed the approach
in clinical settings on real-world samples. Our goal was to differentiate
individuals with active infection based on a series of spectral biomarkers.
We also took into consideration symptoms and other demographic features
of our participants as confounding factors (see the Supporting Information). We propose a new, ultrarapid on-site
method to detect COVID-19 based on pharyngeal swabs using IR light,
with potential for ready implementation in general population settings.
This approach is not designed to replace existing diagnostic methods
such as RT-qPCR but to serve as a rapid prescreening tool to allow
the ready movement of population interactions typical of an open economy.
Methods
Ethical
Approval
This study was carried out in agreement
with the Helsinki declaration and authorized by the Hospitals Directive
due to the emergency situation.[21] Ethical
approval for the investigation was granted by the Ethics Committee
Federal University of Espírito Santo (#0993920.1.0000.5071
and #31411420.9.0000.8207). Full ethical approval was given to undertake
the studies described herein. All procedures and possible risks were
explained to participants before they provided written consent.
Participant Recruitment and Swab Collection
Pharyngeal cotton swabs (FirstLab, Brazil) were collected from individuals who came to
one of the six hospitals participating in the study and met the criteria
for suspected cases according to the State Health Secretary and World
Health Organization (WHO) guidelines between June and September 2020.
For all participants, demographic data (age, gender, pre-existing
medical conditions, symptoms, and date of symptoms’ onset)
were collected (see the Supporting Information). Participants whose RT-qPCR results remained inconclusive
after two rounds of RT-qPCR were excluded from this study.

For the gold-standard
protocol via diagnosis by RT-qPCR, a nasopharyngeal swab was collected
from participants by inserting a rayon swab with a plastic shaft into
the nostril parallel to the palate. The swab was inserted to a location
equidistant from the nostril and the outer opening of the ear and
was gently scraped for a few seconds to absorb secretions. The swab
was then placed immediately into a sterile tube containing a viral
transport medium. RT-qPCR was performed in the Central Laboratory
from the Health Secretary of Espírito Santo (LACEN-SESA) to
allow definitive diagnosis of COVID-19 infection.

For ATR-FTIR
spectroscopy, a pharyngeal swab was collected from
participants by inserting a cotton swab into the mouth and scraping
the tonsils, the tongue, and the inner part of the cheek. The swab
was then placed immediately into a sterile tube and stored on ice
until analysis.
Parallel RT-qPCR Testing
Nasopharyngeal swabs for PCR testing were taken at the same time as
the pharyngeal swabs for spectroscopy. In the clinical setting,
all PCRs were locally or nationally approved tests. All samples were
analyzed at the same state-approved laboratory.

Nucleic acid
extraction and real-time RT-qPCR for virus detection were performed
to allow the identification of SARS-CoV-2. The extraction of total
nucleic acid (DNA and RNA) from collected samples was performed using
the BioGene Extraction kit (Bioclin, K204-4, Brazil), following the
manufacturer’s instructions. Specimens were handled under the
laboratory biosafety guidance required for the novel coronavirus (2019-nCoV)
designated by WHO at the Central Laboratory of the Espírito
Santo state (LACEN-ES). A combination of four tests was employed to
detect viral RNA. The first used the IDT (Integrated DNA Technologies;
Coralville, IA) kit, which was developed in association with the CDC
and employs primers and probes for the N1, N2, and RP genes. The second
was Maccura (designed by Maccura Biotechnology Co., Hi-tech Zone,
Chengdu, China), which is a single-well triple target assay and identifies
three genes from SARS-CoV-2 (E, N, and ORF1ab) and provides a separate
positive internal control (IC). The third was the Molecular SARS-CoV-2
(E/RP genes) kit (Instituto de Tecnologia em Imunobiológicos,
Bio-Manguinhos, FioCruz, RJ, Brazil), which uses primers and probes,
as reported by Corman et al.[22] The detection
of viral RNA was carried out on an ABI 7500 real-time PCR machine
(Applied Biosystems, Weiterstadt, Germany) using the published protocol
and the sequence of primers and probe for E gene and RNAse P. Lastly,
the IBMP (Instituto de Biologia Molecular do Paraná—FioCRUZ)
kit was employed, which is a single-well test and detects N and ORF1ab
genes and uses the RP gene as an internal control.

All assays
were performed according to the manufacturers’ recommendations.
First, all samples were tested in a single-well assay (IBMP or Maccura)
for the qPCR run, and interpretations of all results were added to
a spreadsheet, together with the Ct values obtained. Samples with
inconclusive results, either by nonamplification in the internal control
or by nonamplification of another gene, were tested with the other
two qPCR kits (IDT or Bio-Manguinhos gene E). If the PCR result remained
inconclusive, the result and Ct values were added to the spreadsheet
as a negative.
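The tiered testing logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the laboratory's actual workflow software: the function name `tiered_call`, the outcome labels, and the rule that the first conclusive retest decides the result are assumptions introduced here. Only the final fallback (a sample that stays inconclusive is recorded as negative) is taken directly from the text.

```python
from typing import Iterable

# Outcome labels for a single kit run (illustrative names, not from the paper).
POSITIVE, NEGATIVE, INCONCLUSIVE = "positive", "negative", "inconclusive"


def tiered_call(first_run: str, retests: Iterable[str] = ()) -> str:
    """Return the result recorded in the spreadsheet for one sample.

    first_run: outcome of the single-well assay (IBMP or Maccura).
    retests:   outcomes of the follow-up kits (IDT, Bio-Manguinhos gene E),
               consulted only when the first run is inconclusive.
    """
    if first_run != INCONCLUSIVE:
        return first_run
    for outcome in retests:
        if outcome != INCONCLUSIVE:
            # Assumption: the first conclusive retest decides the call.
            return outcome
    # Still inconclusive after all retests: recorded as negative, per the text.
    return NEGATIVE


print(tiered_call(INCONCLUSIVE, [INCONCLUSIVE, POSITIVE]))
```

How conflicting retest results were reconciled is not stated in this section, so that part of the sketch should be read as a placeholder.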
ATR-FTIR Spectral Analyses of Pharyngeal
Swabs
FTIR
spectral data (wavenumber range 4000–650 cm⁻¹) for each swab (see Table S1) were obtained
by directly placing the saliva swab on a portable Agilent Cary 630
FTIR Spectrometer equipped with an ATR ZnSe crystal (Agilent, Santa
Clara, CA) and Microlab PC software run from a dedicated laptop. Each
whole spectrum contains 1798 points (1.86 cm⁻¹ spectral
resolution). For every ATR-FTIR spectroscopic measurement, three spectra
were obtained from each saliva swab. Each swab analysis was performed
with 32 coadditions, interspersed with 32 background scans. After
each analysis, the swab was removed from the crystal and the crystal
was cleaned with MilliQ water and 70% alcohol, thus avoiding intersample
contamination. The robustness of this analytical method was trialed
in a 2 week demonstration in Kiel (September 2020) during which 625
pharyngeal swabs were taken and analyzed using ATR-FTIR spectroscopy.
Only three swab analyses in the entire spectral data set generated
outliers (data not shown).
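As a quick consistency check on the acquisition settings above, the point spacing implied by 1798 points over the 4000–650 cm⁻¹ range can be computed directly. The sketch below is written in Python for illustration (the study's own analysis used MATLAB). It assumes evenly spaced points that include both endpoints, and the helper names `wavenumber_axis` and `mean_spectrum` are hypothetical.

```python
# Acquisition settings quoted in the text.
WN_HIGH, WN_LOW, N_POINTS = 4000.0, 650.0, 1798


def wavenumber_axis(high=WN_HIGH, low=WN_LOW, n=N_POINTS):
    """Evenly spaced wavenumbers (cm^-1) from high to low, endpoints included.

    Even spacing with both endpoints included is an assumption.
    """
    step = (high - low) / (n - 1)
    return [high - i * step for i in range(n)]


def mean_spectrum(replicates):
    """Point-by-point mean of replicate spectra of equal length."""
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]


axis = wavenumber_axis()
spacing = axis[0] - axis[1]
print(f"{len(axis)} points, spacing = {spacing:.2f} cm^-1")
```

Under these assumptions the spacing comes out at about 1.86 cm⁻¹, in line with the value the text reports as the spectral resolution. `mean_spectrum` shows one simple way the three spectra recorded per swab could be combined into a single spectrum.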
Spiking Experiments to Determine the Limit
of Detection (LoD)
SARS-CoV-2 virus-particle stock solutions
were a generous donation
from Prof. Viviane F. Botosso (Instituto Butantan, Brazil) and Prof.
Edison L. Durigon (Department of Microbiology, Universidade de São
Paulo, Brazil). In brief, a SARS-CoV-2 isolate [HIAE-02: SARS-CoV-2/SP02/human/2020/BRA
(GenBank accession number MT 126808.1)] was employed. The virus
was propagated in African green monkey kidney Vero cells (ATCC CCL-81)
maintained in a Dulbecco’s modified Eagle medium (DMEM) supplemented
with 5% fetal bovine serum (FBS, Gibco), 1% nonessential amino acids
(NEAA), 1% sodium pyruvate (Sigma-Aldrich Co., Deisenhofen, Germany),
and incubated in a humidified atmosphere at 37 °C and 5% CO₂. Viral particles were isolated from the supernatant. To
clarify and remove cellular waste or debris in the supernatant, centrifugation
and diafiltration cycles were performed. Aliquots of the clear supernatant
were transferred to cryogenic tubes and stored at −80 °C
in a freezer prior to irradiation treatment. The full protocol for
γ-irradiated inactivated virus particles remains to be published.

For spiking experiments, γ-irradiated inactivated SARS-CoV-2
virus particles (from a stock solution of 1 × 10⁵ copies/mL
deionized water) were mixed in various copy number concentrations
in saliva taken from a 42-year-old male classified as a negative for
infection. The following protocols were undertaken:

(1) γ-Radiation-inactivated COVID-19 virus solution (4 μL) was applied to the ATR diamond and allowed
to dry for 4–5 min. Then, serial dilutions of the virus in
deionized water were analyzed in a similar fashion.

(2) A series of serial dilutions in saliva
from a negative study participant was generated and applied to the
ATR diamond and allowed to dry for 4–5 min.

(3) Saliva (15 μL) spiked with γ-radiation-inactivated
virus (step (2)) was added to a cotton swab. The saliva cotton swab
was then applied straight to the ATR diamond and immediately analyzed.
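The dilution factor used for the serial dilutions is not stated in this section. However, the concentrations quoted in the Results and figure legends (12 500, 6.25 × 10³, 1.56 × 10³, 781, 98, and 24 copies/mL) are all consistent with successive two-fold dilutions of the 1 × 10⁵ copies/mL stock. The Python sketch below therefore assumes a factor of two; the function name `twofold_series` is hypothetical.

```python
def twofold_series(stock_copies_per_ml, n_steps):
    """Concentrations after 0..n_steps successive 1:2 dilutions.

    The two-fold factor is inferred from the reported concentrations,
    not stated explicitly in the protocol.
    """
    return [stock_copies_per_ml / 2 ** k for k in range(n_steps + 1)]


series = twofold_series(1e5, 12)
for k, conc in enumerate(series):
    print(f"dilution {k:2d}: {conc:9.1f} copies/mL")
```

With a 1 × 10⁵ copies/mL stock, the third, fourth, sixth, seventh, tenth, and twelfth dilutions give 12 500, 6250, 1562.5, 781.25, about 98, and about 24 copies/mL, respectively.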
Data Preprocessing and Analysis
Preprocessing[23] and data analysis were
carried out using MATLAB
R2014b (MathWorks, MA). Data analysis was performed using three
MATLAB toolboxes: PLS Toolbox version 7.9.3 for preprocessing (Eigenvector
Research, Inc.), GA-LDA for feature selection (available at https://doi.org/10.6084/m9.figshare.3479003.v1), and the Classification Toolbox for MATLAB used for graphical outputs
of an LDA algorithm (available at https://michem.unimib.it/download/matlab-toolboxes/classification-toolbox-for-matlab/).[24] The spectra were preprocessed by
truncating to the fingerprint region (1800–900 cm⁻¹), followed by Savitzky–Golay smoothing (9 point window, 2nd
order polynomial fitting), automatic weighted least-squares baseline
correction, and vector normalization. The triplicate spectra
per sample were averaged before model construction. For exploratory
data analyses, following preprocessing of raw spectra, spectral data
were mean-centered and evaluated by means of principal component analysis
(PCA).[25] PCA is an unsupervised technique
that reduces the spectral data space to principal components (PCs)
responsible for the majority of variance in the original data set.
The PCs are mutually orthogonal; the first PC accounts for
the maximum explained variance, followed by the second PC, and so on.
The PCs are composed of scores and loadings, where the former represent
the variance in the sample direction and are thus used to assess similarities/dissimilarities
among the samples; the latter represent the contribution of each
variable to the model decomposition and are thus used to find important
spectral markers. This technique looks for inherent similarities/differences
and provides a scores matrix representing the overall “identity”
of each sample, a loading matrix representing the spectral profile
in each PC, and a residual matrix containing the unexplained data.
Scores information can be used for exploratory analysis providing
possible classification between data classes.

PCA was the method
of choice for analyzing saliva samples spiked with an inactivated
virus particle. It is simple, fast, and combines exploratory analysis,
data reduction, and feature extraction into one single method. PCA
scores were used to explore overall data set variance and any clustering
related to the limit of detection, while the loadings on the first
two PCs were used to derive specific biomarkers indicative of the
infection category.

A genetic algorithm (GA) is a variable selection technique used
to reduce the spectral data space to a few variables; it works by
simulating an evolutionary process on candidate subsets of variables.[26,27] The original variable space is retained and no transformation
is made, unlike in PCA. Therefore, the selected variables have the same
meaning as the original ones (i.e., wavenumbers), and they mark
the regions where the classes being analyzed differ most or, in other words, where the chemical changes occur.

For all classification models, samples were divided into training
(n = 50 designated negative and 50 designated positive for COVID-19 infection
based on symptoms and RT-PCR; see Tables S2 and S3) and validation (n = 61 designated negative
and 20 designated positive for COVID-19 infection based on symptoms
and RT-PCR) sets by applying the Kennard–Stone (KS) uniform
sampling selection algorithm.[23] The training
samples were used in the modeling procedure, whereas the validation
set was only used in the final classification evaluation using the
LDA discriminant approach. The optimal number of variables for GA
was determined with an average risk G of LDA misclassification.
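Such cost function is calculated in a subset of the training set as

$$G = \frac{1}{N_V}\sum_{n=1}^{N_V} g_n$$

where $N_V$ is the number of samples in that subset and $g_n$ is defined as

$$g_n = \frac{r^2\left(x_n, m_{I(n)}\right)}{\min_{I(m) \neq I(n)} r^2\left(x_n, m_{I(m)}\right)}$$

where the numerator is the squared Mahalanobis
distance between the object $x_n$ and the
sample mean $m_{I(n)}$ of its true class, and the denominator is the squared Mahalanobis
distance between the object $x_n$ and the
mean of the closest wrong class.[25,28]

The GA calculations were performed during 100 generations with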
200 chromosomes each. One-point crossover and mutation probabilities
were set to 60 and 10%, respectively. GA is a nondeterministic algorithm,
which can give different results when the same model is run repeatedly.
Therefore, the algorithm was repeated three times, starting from random
initial populations, with the best solution resulting from the three
realizations of GA employed.

Sensitivity (the probability that
a test result will be positive
when the disease is present) and specificity (the probability that
a test result will be negative when the disease is not present) were
given by the following equations

$$\text{Sensitivity (\%)} = \frac{\text{TP}}{\text{TP} + \text{FN}} \times 100$$

$$\text{Specificity (\%)} = \frac{\text{TN}}{\text{TN} + \text{FP}} \times 100$$

where TP is defined as true positive, FN as
false negative, TN as true negative, and FP as false positive.
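The two definitions above translate directly into code. The Python sketch below is for illustration only (the study's analysis was run in MATLAB). The confusion-matrix counts are hypothetical: they are not reported in this section and were chosen only because they are consistent with a validation set of 20 positives and 61 negatives and with the sensitivity and specificity quoted in the abstract.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Percentage of truly positive samples that test positive."""
    return 100.0 * tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Percentage of truly negative samples that test negative."""
    return 100.0 * tn / (tn + fp)


# Illustrative counts only (NOT taken from the paper): chosen to match
# 20 positive and 61 negative validation samples.
tp, fn = 19, 1
tn, fp = 54, 7

print(f"sensitivity = {sensitivity(tp, fn):.0f}%")
print(f"specificity = {specificity(tn, fp):.0f}%")
```

With these assumed counts, 19 of 20 positives gives 95% and 54 of 61 negatives gives 88.5%, which rounds to 89%.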
Results
and Discussion
Spiking of Saliva with Inactivated COVID-19
Virus
Figure 1a shows a typical
spectrum of inactivated γ-irradiated COVID-19 virus particles
[eSARS.CoV2/SP02.2020.HIAE.Br (GenBank accession number MT 126808.1)];[21] at 1582 copies/mL, an ATR-FTIR spectrum
with a good signal-to-noise ratio (SNR) is obtained. This experiment was designed to assess
the limit of detection (LoD) for biospectroscopy, that is, the
minimum concentration at which the virus could be detected by IR spectroscopy.
Below this level, the SNR becomes poor. This points
to the ability of ATR-FTIR spectroscopy to extract a unique viral
fingerprint consisting of spectral features associated with a pure
virus spectrum. It is interesting to note that the predominant spectral
peaks can be tentatively associated with nucleic acid bands, including
RNA. Following nucleic acid (RNA/DNA) extraction of saliva samples
obtained from participants either positive (n = 5)
or negative (n = 5) for COVID-19, clear segregation
of spectral data points is obtained using exploratory principal component
analysis (PCA) (Figure 1b).
Figure 1
Preliminary SARS-CoV-2 analyses employing ATR-FTIR spectroscopy.
(a) Spectra of pure virus in saliva (whole virus inactivated by γ-radiation)
[e SARS.CoV2/SP02.2020.HIAE.Br (GenBank accession number MT 126808.1)]. (b) Graphical demonstration of separation in a PCA scores plot
of positive and negative samples in RNA-extracted samples prepared
for PCR analyzed by biospectroscopy.
Control saliva from a human participant (male, 42 years and RT-qPCR
negative) was spiked with various numbers of inactivated γ-irradiated
COVID-19 virus particles (Figure 2). At low copy numbers, the virus particle is clearly
capable of modifying the IR spectral signature of saliva (Figure 2a,b). Examination
of control saliva in comparison with saliva spiked with inactivated virus
particles at various copy number levels highlighted an ability to detect
virus particle-induced spectral alterations at levels that would be
considered extremely low in the pharyngeal cavity of infected humans
(symptomatic or asymptomatic). Even more compellingly, when this is
examined using basic multivariate analysis (i.e., PCA), the IR spectral
signature of pure inactivated virus segregates away from control saliva
in a scores plot (Figure 2c). When saliva is spiked with exceptionally low levels of
virus (781 copies/mL; a1 cluster below), the spectral points cocluster
with control saliva spectral points, suggesting no differences. However,
at a level of 12 500 copies/mL (a2 cluster below), there is
segregation from the control. It is critical to note that the loading
plot specifically identifies RNA as being proportional to virus levels
(Figure 2b). The loadings
on PC1 show the bands responsible for the increase in the virus concentration
(nucleic acid bands), and the loadings on PC2 show the bands responsible
for discrimination between saliva and virus (amide I and amide II
bands present in saliva but not virus). Other, primarily protein-associated
bands discriminate the saliva from the virus—we believe this
to be the first report of its kind using biospectroscopy.
Figure 2
Spiking of
control saliva with SARS-CoV-2 virus and analyses using
ATR-FTIR spectroscopy. (a) Average raw spectra and (b) preprocessed
spectra for saliva (n = 2), pure SARS-CoV-2 virus
in different concentrations (n = 28, 1 × 10⁵–98 copies/mL), and saliva + virus in different concentrations
(n = 63, 1 × 10⁵–24 copies/mL).
(c) PCA scores and (d) PCA loadings on PC1 versus PC2 for the preprocessed
data. Insets: (c1) mix between saliva and saliva + virus for low concentration
(≤781 copies/mL) and (c2) mix between pure virus and saliva
+ virus for high concentration (≥1.25 × 10⁴ copies/mL). Preprocessing: Savitzky–Golay (SG) smoothing
(7 point window, 2nd order polynomial fitting) and baseline correction.
The loadings on PC1 show the bands responsible for the increase in
the virus concentration (nucleic acid bands), and the loadings on
PC2 show the bands responsible for discrimination between saliva and
virus (amide I and amide II bands present in saliva but not virus).
Furthermore, in the complex milieu of a saliva
sample, which will
undoubtedly contain a range of complex constituents including aqueous,
exfoliated cellular material, and postinfection immunoglobulins such
as IgA and other individual or contaminating factors, a multivariate
chemometric approach can still extract the viral-associated discriminating
features. Following this, Figure 3 shows the analysis of swabs loaded with saliva, either
with or without spiking with γ-irradiated COVID-19 virus particles. Figure 3a,b shows spectra
with a good SNR. In the subsequent PCA scores plots, the spectral data
points for virus-spiked saliva swabs segregate away from swab or control
saliva swab categories (Figure 3c). This is achieved at low copy numbers. The loadings on
PC1 show the bands responsible for separation between swab + saliva
and swab + saliva + virus (amide I and amide II band of proteins)
and the loadings on PC2 show the bands responsible for variation of
virus concentration (amide I, amide II, and nucleic acid bands) (Figure 3d). Unlike
saliva, the swab sample contains bands in the nucleic acid region
plus amide I and amide II that may come from the saliva itself.
Figure 3
Spiking of
swab with saliva or saliva + SARS-CoV-2 virus prior
to analyses using ATR-FTIR spectroscopy. (a) Average raw spectra and
(b) preprocessed spectra for swab + saliva (n = 5)
and swab + saliva + virus (n = 54, 1 × 10⁵–98 copies/mL). (c) PCA scores and (d) PCA loadings
on PC1 versus PC2 for the preprocessed data. Insets: (c1) virus concentration
around 6.25 × 10³ copies/mL, (c2) virus concentration
around 1.56 × 10³ copies/mL, and (c3) virus concentration
≤ 781 copies/mL. Preprocessing: Savitzky–Golay (SG)
smoothing (7 point window, 2nd order polynomial fitting) and baseline
correction. The loadings on PC1 show the bands responsible for separation
between swab + saliva and swab + saliva + virus (amide I and amide
II band of proteins) and the loadings on PC2 show the bands responsible
for variation of virus concentration (amide I, amide II, and nucleic
acids bands). Different from saliva, the swab sample contains bands
on the nucleic acids region plus amide I and amide II that may come
from the saliva itself.
GA-LDA Segregation of Categories:
COVID-19 Infected versus Uninfected
Categorization into the
negative (designated not infected by COVID-19)
and positive (designated as infected) categories was based on a series
of RT-qPCR tests, primarily carried out at Central Laboratory of Espírito
Santo State alongside symptoms/outcome (see Tables S2 and S3). Follow-up showed that COVID-19 participants required
hospitalization. A Ct < 37 in RT-qPCR designated a PCR-positive
result (Figure 4; see
the Supporting Information). Metaparameters
such as gender, age, or smoking habits were not incorporated into
model construction; some participant details such as the presence
or the absence of symptoms (see Tables S2 and S3) were used for designation of negative versus positive categories.
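The PCR part of this designation can be written as a small rule. The sketch below is illustrative only: the function name `pcr_designation` is hypothetical, a missing Ct stands for "no amplification", and the handling of a Ct of exactly 37 is an assumption because the text only specifies Ct < 37 and Ct > 37. The final category assignment in the study also drew on symptoms and outcome, which this sketch does not model.

```python
def pcr_designation(ct):
    """Map an E-gene Ct value to a PCR designation.

    ct is None when no amplification was observed.
    Assumption: Ct == 37 is grouped with Ct > 37, since the source
    does not state how that boundary value was treated.
    """
    if ct is None:
        return "negative"  # no amplification
    if ct < 37:
        return "positive"  # threshold stated in the text
    return "negative/inconclusive"


print(pcr_designation(24.3))  # positive
print(pcr_designation(38.1))  # negative/inconclusive
print(pcr_designation(None))  # negative
```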
Figure 4
RT-qPCR of samples. A tiered RT-qPCR system was employed for the
analyses of parallel nasopharyngeal samples taken from study participants.
(a) The samples were considered positive if the E gene was amplified
with Ct < 37. (b) A standard curve is below alongside a negative
control. (c) The efficiency curve to determine the threshold. (d)
Three positive samples are shown juxtaposed with three negative samples
(no amplification) and three with a Ct > 37 (negative/inconclusive).
The chemometric technique of genetic algorithm-linear discriminant
analysis (GA-LDA) was applied toward classification[25,28,29] of negative
versus positive for COVID-19 infection (Figure 5). The classification
ratios achieved after GA-LDA were a sensitivity of 95% and a specificity
of 89% (Table 1). Figure 5a,b shows the full raw spectra across the entire
mid-IR spectral range and the raw spectra in the fingerprint region for
all negative and COVID-19 positive swab samples (n = 111 negatives
and 70 positives). Spectra were preprocessed [Savitzky–Golay
smoothing (9 point window, 2nd order polynomial fitting), automatic
weighted least-squares baseline correction, and vector normalization]
in the fingerprint region (Figure 5c). Training and validation of GA-LDA
were undertaken using 50 negatives and 50 positives; the GA-LDA scores
plot for the validation set (n = 61 negatives and 20 positives)
is shown in Figure 5d.
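The preprocessing chain named above (smoothing, baseline correction, vector normalization) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline. The Savitzky–Golay coefficients are the standard tabulated values for a 9-point, 2nd-order window. The baseline step is a simple straight-line subtraction used as a stand-in for the automatic weighted least-squares correction, which is not reproduced here. All function names are hypothetical, and edge points are left unsmoothed for simplicity. In practice, `scipy.signal.savgol_filter(y, 9, 2)` would be the usual choice for the smoothing step.

```python
import math

# Tabulated Savitzky-Golay smoothing weights: 9-point window,
# 2nd-order polynomial (normalising factor 231).
SG9_QUADRATIC = [c / 231 for c in (-21, 14, 39, 54, 59, 54, 39, 14, -21)]


def sg_smooth(y, coeffs=SG9_QUADRATIC):
    """Smooth interior points; the first and last half-window points
    are copied unchanged (a simplification of edge handling)."""
    half = len(coeffs) // 2
    out = list(y)
    for i in range(half, len(y) - half):
        window = y[i - half:i + half + 1]
        out[i] = sum(c * v for c, v in zip(coeffs, window))
    return out


def subtract_linear_baseline(y):
    """Stand-in baseline: subtract the line joining the two end points.
    This is NOT the automatic weighted least-squares method."""
    slope = (y[-1] - y[0]) / (len(y) - 1)
    return [v - (y[0] + slope * i) for i, v in enumerate(y)]


def vector_normalize(y):
    """Scale the spectrum to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in y))
    return [v / norm for v in y] if norm else list(y)


def preprocess(y):
    """Apply the three steps in the order given in the text."""
    return vector_normalize(subtract_linear_baseline(sg_smooth(y)))


# Synthetic absorbance values, for illustration only.
spectrum = [math.sin(x / 3) + 0.1 * x for x in range(30)]
processed = preprocess(spectrum)
```

Vector normalization comes last so that differences in overall absorbance between swabs do not dominate the classifier input.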
Figure 5
Analyses of pharyngeal swabs using ATR-FTIR spectroscopy in a clinical
setting. (a) Full raw spectra and (b) raw spectra in the fingerprint
region for all negative and COVID-19 positive swab samples (n = 111 negatives and 70 positives). (c) Preprocessed spectra
[Savitzky–Golay smoothing (9 point window, 2nd order polynomial
fitting), automatic weighted least-squares baseline correction, and
vector normalization] in the fingerprint region and the (d) GA-LDA
score plot for the validation set (n = 61 negatives
and 20 positives).
Table 1
Confusion Matrix Showing the Number of Patients and Figures of Merit for the Validation Set Using GA-LDA Algorithm
Preprocessing[a]: baseline correction and vector normalization[b]

| | predicted negative | predicted positive |
|---|---|---|
| negative | 54 | 7 |
| positive | 1 | 19 |

| parameter | value |
|---|---|
| accuracy | 90% (95% CI, 76–97%) |
| sensitivity | 95% (95% CI, 73–100%) |
| specificity | 89% (95% CI, 77–95%) |
| F-score | 92% (95% CI, 75–97%) |

[a] 95% CI: 95% confidence interval.
[b] Preprocessing: Savitzky–Golay smoothing (9 point window, 2nd order polynomial fitting), automatic weighted least-squares baseline correction, and vector normalization.
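The point estimates in Table 1 follow directly from the confusion matrix counts. The sketch below recomputes them. The function name is hypothetical. One assumption is made: the F-score formula is not stated in this section, and the reported 92% is consistent with the harmonic mean of sensitivity and specificity, so that definition is used here. The confidence intervals are not recomputed because the interval method is not given.

```python
def figures_of_merit(tn, fp, fn, tp):
    """Return accuracy, sensitivity, specificity and F-score as fractions."""
    sensitivity = tp / (tp + fn)  # true positives among actual positives
    specificity = tn / (tn + fp)  # true negatives among actual negatives
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Assumption: F-score = harmonic mean of sensitivity and specificity.
    f_score = 2 * sensitivity * specificity / (sensitivity + specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "F-score": f_score,
    }


# Validation-set counts from Table 1.
merit = figures_of_merit(tn=54, fp=7, fn=1, tp=19)
for name, value in merit.items():
    print(f"{name}: {value:.0%}")
```

With these counts, sensitivity is 19/20 and specificity is 54/61, which round to the 95% and 89% quoted in the text.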
Consequently, five GA-LDA-selected variables were identified, each
of which significantly (P < 0.01) discriminates negative
and COVID-19 positive swab samples (Figure 6 and Table 2). Using saliva
swab-based vibrational spectroscopy, we achieved results with significant
clinical relevance. ATR-FTIR spectroscopy proved capable of distinguishing
between negative and COVID-19 positive swab samples.
A plausible mechanistic basis for this is that the prominent distinguishing
features extracted are primarily associated with nucleic acids, RNA
in particular. Four of the five variables, those tentatively assigned to
nucleic acids, are higher in the negative group than in the positive
category. A plausible hypothesis is that the increase at 1429 cm⁻¹
is associated with a virus, e.g., a simple RNA virus. The corresponding
decreases at 1220, 1084, 1069, and 1041 cm⁻¹ may
be associated with a response of the host organism to the virus infection.
Figure 6
GA-LDA selected variables. Arrow ↑: higher absorbance in
the COVID-19 positive class. Arrow ↓: higher absorbance in
the COVID-19 negative class. P-value calculated using
a MANOVA test with all five GA-LDA-selected variables between all
negative and positive samples.
Table 2
Selected Variables by GA-LDA With Their Respective Tentative Assignments

| variable (cm⁻¹) | tentative assignment |
|---|---|
| 1429 | δ(CH₂) polysaccharides |
| 1220 | asymmetric PO₂⁻ stretching in RNA and DNA |
| 1084 | symmetric PO₂⁻ stretching in nucleic acids |
| 1069 | C–O stretching in ribose |
| 1041 | symmetric PO₂⁻ stretching in nucleic acids |
Future work will extend toward validation for regulatory approval.
We believe that existing spectrometers and current manufacturing capacity
of HeNe lasers will be sufficient for the significant rollout of our
technology in the event of it being adopted. Even with vaccines being
adopted (their efficacy and safety remain to be proven) and the possible
emergence of herd immunity (which may not occur, given that SARS-CoV-2
is a mutating RNA virus), there will remain
a need for risk reduction strategies to get worldwide economies moving
again. In a demonstration in September 2020 at the Kiel (Schleswig-Holstein,
Germany) Regatta, which is an annual sailing event (plus Olympic qualifier),
we trialed this method (data not shown). Over 2 weeks, some 620 pharyngeal
swabs from workers and athletes (from some 40 countries) were analyzed;
in comparison with RT-qPCR, we had 100% specificity, which excluded
the possibility of cross-reactivity. We believe this approach could
translate as a viable option in the battle against SARS-CoV-2.