Rishi K Gupta1,2, Joshua Rosenheim2, Lucy C Bell2, Aneesh Chandran2, Jose A Guerra-Assuncao2, Gabriele Pollara2, Matthew Whelan2, Jessica Artico3, George Joy3, Hibba Kurdi3, Daniel M Altmann4, Rosemary J Boyton5, Mala K Maini2, Aine McKnight6, Jonathan Lambourne7, Teresa Cutino-Moguel8, Charlotte Manisty9,3, Thomas A Treibel9,3, James C Moon9,3, Benjamin M Chain2, Mahdad Noursadeghi2.
Abstract
BACKGROUND: We hypothesised that host-response biomarkers of viral infections might contribute to early identification of individuals infected with SARS-CoV-2, which is critical to breaking the chains of transmission. We aimed to evaluate the diagnostic accuracy of existing candidate whole-blood transcriptomic signatures for viral infection to predict positivity of nasopharyngeal SARS-CoV-2 PCR testing.
Year: 2021 PMID: 34250515 PMCID: PMC8260104 DOI: 10.1016/S2666-5247(21)00146-4
Source DB: PubMed Journal: Lancet Microbe ISSN: 2666-5247
Figure 1: Study profile
Characteristics of whole-blood RNA signatures for viral infection included in analysis
| Signature | Model | Discovery population | Discovery setting | Discovery approach | Validation population | Intended application |
| --- | --- | --- | --- | --- | --- | --- |
| AndresTerre11 | Geometric mean of all genes (influenza meta-signature) | Five cohorts of children and adults with influenza; adults challenged with influenza; and adults with bacterial pneumonia | UK, USA, and Australia | Differential expression followed by leave-one-cohort-out strategy and filtering for heterogeneity of effect size, using genome-wide data | Eight cohorts of children or adults with influenza or bacterial infection; adults challenged with influenza; and adults vaccinated against influenza | Influenza |
| Henrickson16 | Difference in geometric means between upregulated and downregulated genes (influenza paediatric signature score) | Four cohorts of children with influenza-like illness | USA | Meta-analysis and leave-one-out strategy to identify common genes using genome-wide data | Two cohorts of children or adults with influenza | Influenza infection |
| Herberg2 | Disease risk score | Children with viral or bacterial infection | UK, USA, and Spain | Elastic net followed by forward selection–partial least squares, using significantly differentially expressed transcripts | Children with bacterial or viral infection, inflammatory disease, or indeterminate diagnosis | Viral |
| | NA | Children with viral or bacterial infection | UK, USA, and Spain | Elastic net followed by forward selection–partial least squares, using significantly differentially expressed transcripts | Children with bacterial or viral infection | Viral |
| | NA | Three cohorts of adults challenged with rhinovirus, influenza, or RSV | UK and USA | Sparse latent factor regression analysis on genome-wide data | Close contacts of students with acute upper respiratory viral infections | Pre-symptomatic viral infection |
| Lopez7 | Sum of weighted gene expression values (bacterial | Children and adults with viral, bacterial, or non-infectious acute respiratory illness | USA | Support vector machine analysis using genome-wide data | Children with acute viral or bacterial infections | Viral |
| Lydon15 | Logistic regression (viral classifier) | Adolescents and adults with viral, bacterial, or non-infectious acute respiratory illness | USA | LASSO regression analysis using 87 selected target genes from previously derived signatures | Patients with viral or bacterial co-infection or suspected bacterial infection | Viral |
| | NA | NA | NA | Preselected due to biological plausibility | Adults challenged with the live yellow fever virus vaccine | Viral infection |
| | NA | Children with RSV infection | The Netherlands | Differential expression and prediction analysis of microarrays classifier training using genome-wide data | A second cohort of children with RSV infection | Severity of RSV infection in children |
| Pennisi2 | Disease risk score | Children with viral or bacterial infection | UK, USA, and Spain | Elastic net followed by forward selection–partial least squares, using significantly differentially expressed transcripts | Children with bacterial or viral infection | Viral |
| Sampson10 | Disease risk score (combined SeptiCyte score) | Eight cohorts of neonates, children, and adults with bacterial infections | UK, USA, Estonia, and Australia | Regression analysis of transcript pairs using the 6000 most highly expressed genes from each dataset | Unselected consecutive patients presenting to the emergency department with febrile illness | Viral |
| Sampson4 | Disease risk score (Septicyte VIRUS) | Ten cohorts of children and adults with viral infections; two cohorts of adults challenged with influenza; and two cohorts of macaques challenged with Lassa virus or lymphocytic choriomeningitis virus | USA, Brazil, Finland, and Australia | Regression analysis of transcript pairs using the 6000 most highly expressed genes from each dataset | Seven human cohorts and six non-human mammal cohorts infected or challenged with viruses across all seven of the Baltimore virus classification groups | Viral |
| Sweeney11 | Difference in geometric means between upregulated and downregulated genes, multiplied by ratio of counts of positive to negative genes (Sepsis metascore) | Nine cohorts of patients with sepsis or trauma | USA, Australia, Spain, Greece, the Netherlands, Norway, Canada, and UK | Greedy forward search of 82 differentially expressed genes identified by multicohort analysis | 12 cohorts of adults with viral or bacterial sepsis, or trauma | Viral or bacterial sepsis |
| Sweeney7 | Difference in geometric means between upregulated and downregulated genes, multiplied by ratio of counts of positive to negative genes (bacterial or viral metascore) | Eight cohorts of children and adults with viral and bacterial infections | USA, Australia, UK | Greedy forward search of 72 differentially expressed genes identified by multicohort analysis | 24 cohorts of children and adults with viral or bacterial infections, or healthy controls | Viral |
| Trouillet-Assant6 | Median expression of six interferon-stimulated genes (interferon score) | NA | NA | Differential expression using 15 preselected interferon-stimulated genes | Febrile children with bacterial or viral infection | Viral |
| Tsalik33 | Logistic regression (viral ARI classifier) | Children and adults with viral, bacterial, or non-infectious acute respiratory illness, and healthy controls | USA | LASSO regression analysis using the 40% of microarray probes with the largest variance after batch correction | Five cohorts of children or adults with viral, bacterial, or non-infectious respiratory illness, or viral or bacterial co-infection | Viral |
| Yu3; | Yu3: mean expression (non-RSV infections | Children with acute respiratory illness and a positive result for a viral infection on a nasopharyngeal swab | USA | Modified supervised principal component analysis using all expressed transcripts | Children with RSV or rhinovirus infection | Viral |
| Zaas48 | Probit regression (viral classifier) | Two cohorts of adults challenged with influenza A H3N2 or H1N1 | USA | Elastic net using 48 selected genes comprised of: 29 derived as a signature in a previous study, | Adults presenting to the emergency department with fever and healthy controls | Viral |
Log2-transformed transcripts per million data were used to calculate all signatures. NA=not applicable. RSV=respiratory syncytial virus. LASSO=least absolute shrinkage and selection operator. RT-LAMP=reverse transcription loop-mediated isothermal amplification.
Where applicable, the name of the signature from the original publication is indicated in brackets.
Defined as the sum of downregulated genes subtracted from the sum of upregulated genes.
Study by McClain and colleagues sought to validate a 36-transcript signature for the detection of respiratory viral infections. Model coefficients for the 36-transcript model are not provided; therefore, we included in this analysis the two best performing single transcripts from the study, since they had similar performance to the full model in the original publication.
Logistic and probit regression models were calculated on the linear predictor scale using model coefficients from original publications.
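The scoring rules described in the footnotes above can be illustrated with a minimal Python sketch. It is not the authors' code: the gene names, coefficients, and intercept are hypothetical, and the pseudocount added before the log2 transform is an assumption that the source does not state.

```python
import math


def log2_tpm(tpm, pseudocount=1.0):
    """Log2-transform a transcripts-per-million value.

    The pseudocount is an assumption made here to avoid log2(0);
    the source does not say whether one was used.
    """
    return math.log2(tpm + pseudocount)


def sum_difference_score(expr, up_genes, down_genes):
    """Sum of upregulated genes minus sum of downregulated genes."""
    up = sum(expr[g] for g in up_genes)
    down = sum(expr[g] for g in down_genes)
    return up - down


def linear_predictor(expr, coefficients, intercept=0.0):
    """Regression score on the linear predictor scale.

    Returns intercept + sum(beta_g * x_g), with no logistic or
    probit link applied.
    """
    return intercept + sum(beta * expr[g] for g, beta in coefficients.items())


# Hypothetical sample: gene names and TPM values are illustrative only.
expr = {
    "GENE_A": log2_tpm(7.0),
    "GENE_B": log2_tpm(3.0),
    "GENE_C": log2_tpm(1.0),
}

score = sum_difference_score(expr, ["GENE_A", "GENE_B"], ["GENE_C"])
lp = linear_predictor(expr, {"GENE_A": 0.5, "GENE_C": -1.0}, intercept=0.25)
print(score, lp)
```

Staying on the linear predictor scale keeps a regression-based score monotonic in the predicted probability, so rank-based metrics such as the AUROC are unchanged by omitting the link function.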
Figure 2: Correlation and Jaccard indices for all eligible RNA signatures for viral infection
(A) Jaccard index intersect of constituent genes for all pairs of signatures clustered by Euclidean distance, indicating the proportion of the gene lists that overlap in each pairwise comparison of signatures. The order of row labels for individual signatures is mirrored in the columns of the heatmap. (B) Spearman rank correlation coefficients for all pairs of signatures clustered by 1 – Spearman rank distance. The order of row labels for individual signatures is mirrored in the columns of the heatmap. (C) Relationship between pairwise Jaccard indices and Spearman rank correlation coefficients. (D) Network plot of significantly enriched predicted upstream regulators (cytokines, transmembrane receptors, kinases, and transcription factors) of all constituent genes in any signature. The size of upstream regulator nodes is proportional to statistical enrichment. Node labels are shown for the ten most statistically enriched upstream regulators (false discovery rate <5 × 10⁻¹⁷). Full details of our upstream regulator analysis are in appendix 2.
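As an illustration of the two pairwise measures in panels A and B, the sketch below computes a Jaccard index between two gene lists and a Spearman rank correlation between two vectors of signature scores. It is a plain-Python sketch under stated assumptions, not the authors' pipeline: the gene names and score values are hypothetical, and ties are handled with average ranks, which the source does not specify.

```python
import math


def jaccard_index(genes_a, genes_b):
    """Size of the intersection divided by size of the union."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gene lists and per-sample scores for two signatures.
overlap = jaccard_index({"GENE_A", "GENE_B", "GENE_C"}, {"GENE_B", "GENE_C", "GENE_D"})
rho = spearman([0.1, 0.4, 0.9, 1.7], [1.2, 2.0, 2.1, 5.5])
print(overlap, rho)
```

Two signatures can share few genes (low Jaccard index) yet rank samples almost identically (high Spearman coefficient), which is the relationship that panel C examines.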
Validation metrics of whole-blood RNA signatures for discrimination of participants with PCR-confirmed SARS-CoV-2 infection at first week of PCR positivity
| Signature | AUROC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | p value |
| --- | --- | --- | --- | --- |
| | 0·95 (0·91–0·99) | 0·84 (0·70–0·93) | 0·95 (0·85–0·98) | .. |
| Sweeney7 | 0·95 (0·91–0·99) | 0·82 (0·67–0·91) | 0·95 (0·85–0·98) | 0·85 |
| Zaas48 | 0·93 (0·88–0·98) | 0·61 (0·45–0·74) | 0·95 (0·85–0·98) | 0·088 |
| Pennisi2 | 0·91 (0·86–0·96) | 0·58 (0·42–0·72) | 0·95 (0·85–0·98) | 0·088 |
| | 0·90 (0·84–0·96) | 0·55 (0·40–0·70) | 0·95 (0·85–0·98) | 0·039 |
| AndresTerre11 | 0·89 (0·83–0·95) | 0·55 (0·40–0·70) | 0·95 (0·85–0·98) | 0·021 |
| Henrickson16 | 0·89 (0·82–0·96) | 0·55 (0·40–0·70) | 0·93 (0·83–0·97) | 0·0093 |
| TrouilletAssant6 | 0·87 (0·80–0·94) | 0·53 (0·37–0·68) | 0·93 (0·83–0·97) | 0·008 |
| Lydon15 | 0·86 (0·79–0·94) | 0·58 (0·42–0·72) | 0·95 (0·85–0·98) | 0·0046 |
| Herberg2 | 0·84 (0·76–0·92) | 0·5 (0·35–0·65) | 0·93 (0·83–0·97) | 0·0034 |
| Sampson4 | 0·84 (0·76–0·92) | 0·5 (0·35–0·65) | 0·93 (0·83–0·97) | 0·0027 |
| Sampson10 | 0·83 (0·74–0·92) | 0·5 (0·35–0·65) | 0·95 (0·85–0·98) | 0·0021 |
| | 0·83 (0·74–0·91) | 0·47 (0·32–0·63) | 0·93 (0·83–0·97) | 0·0021 |
| | 0·82 (0·74–0·91) | 0·45 (0·30–0·60) | 0·95 (0·85–0·98) | 0·0017 |
| Tsalik33 | 0·79 (0·70–0·89) | 0·39 (0·26–0·55) | 0·98 (0·9–1·0) | 0·0011 |
| Lopez7 | 0·79 (0·69–0·88) | 0·37 (0·23–0·53) | 0·98 (0·9–1·0) | 0·00080 |
| | 0·75 (0·64–0·86) | 0·45 (0·30–0·60) | 0·93 (0·83–0·97) | 0·00027 |
| | 0·62 (0·51–0·74) | 0·03 (0·0–0·13) | 0·98 (0·9–1·0) | <0·0001 |
| Sweeney11 | 0·60 (0·48–0·73) | 0·16 (0·07–0·30) | 0·96 (0·88–0·99) | <0·0001 |
| Yu3 | 0·59 (0·47–0·71) | 0·05 (0·01–0·17) | 1 (0·93–1·0) | <0·0001 |
Data are point estimates (95% CIs). Includes 38 contemporaneous SARS-CoV-2-positive samples and 55 SARS-CoV-2-negative samples. Discrimination is shown as AUROC. Sensitivity and specificity are shown using predefined thresholds of 2 SDs above the mean of the uninfected control population (Z2). p values show pairwise comparisons to best performing signature with Benjamini-Hochberg adjustment (false discovery rate 0·05). Equivalent data for discrimination between test-negative controls and participants with SARS-CoV-2 infection 1 week before positive PCR test are in appendix 1 (p 7). AUROC=area under the receiver operating characteristic curve.
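The metrics described in this footnote can be sketched in a few lines of Python. This is an illustrative sketch only, with several assumptions not taken from the source: the score values are made up, the control SD is the sample SD, a sample counts as positive when its Z score is strictly greater than 2, and the AUROC is computed by direct pairwise comparison. The Benjamini-Hochberg function adjusts a list of p values; the test that produced the pairwise p values is not stated in this excerpt and is not reproduced here.

```python
from statistics import mean, stdev


def z_scores(scores, control_scores):
    """Standardise scores against the uninfected control distribution."""
    mu, sd = mean(control_scores), stdev(control_scores)
    return [(s - mu) / sd for s in scores]


def sensitivity_specificity(case_z, control_z, threshold=2.0):
    """Sensitivity and specificity at a Z-score threshold (Z2 by default)."""
    sens = sum(z > threshold for z in case_z) / len(case_z)
    spec = sum(z <= threshold for z in control_z) / len(control_z)
    return sens, spec


def auroc(case_scores, control_scores):
    """Probability that a random case scores above a random control.

    Ties count as half a win.
    """
    wins = sum(
        (c > n) + 0.5 * (c == n)
        for c in case_scores
        for n in control_scores
    )
    return wins / (len(case_scores) * len(control_scores))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical signature scores, for illustration only.
controls = [1.0, 2.0, 3.0, 4.0, 5.0]
cases = [2.0, 7.0, 9.0]

sens, spec = sensitivity_specificity(
    z_scores(cases, controls), z_scores(controls, controls)
)
print(sens, spec, auroc(cases, controls))
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

Because the threshold is fixed relative to the control distribution rather than tuned to the cases, specificity is set close to the upper tail of the controls by construction, and sensitivity is what varies between signatures.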
Figure 3: Four best performing RNA signatures for discriminating between controls and test-positive participants at the time of SARS-CoV-2-positive PCR test
(A) Z scores for each RNA signature in the test-negative control group and in the test-positive group, stratified by time relative to first SARS-CoV-2-positive PCR test. Convalescent samples were collected at study week 24. AUROCs (95% CIs) are for discriminating between test-negative controls and test-positive participants at the time of first SARS-CoV-2-positive PCR test (0 weeks). (B) Z scores versus contemporaneous PCR cycle threshold for SARS-CoV-2 open reading frame 1, with Spearman rank correlation coefficients. AUROC=area under the receiver operating characteristic curve.